METHOD AND APPARATUS FOR OBJECT
ORIENTED MULTIMEDIA EDITING

A portion of the disclosure of this patent document contains material which is subject to 
copyright protection. The copyright owner has no objection to the facsimile reproduction by 
anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark 
Office patent files or records, but otherwise reserves all copyright rights whatsoever. 

A computer program listing Appendix that includes a software program code executable 
on computer as described below is part of this application. 

FIELD OF INVENTION 

The present invention relates to a method and apparatus for editing and creating a 
multimedia program in a visual workspace. 

BACKGROUND 

Computer-based editing systems for video and audio information are well known, and a 
number of such systems are commercially available. These typically allow the user of the 
editing (or production) system, who is referred to as an editor or producer, to create and edit 
programs from segments or clips of video or audio, and which usually also include text and 
graphics material. 

In addition, it is known to include what are called links or hyperlinks in audio and/or 
video multimedia presentations. These links allow the viewer or listener of the program to jump 
from one program segment to another at his/her election rather than to merely listen to or watch the
program in a conventional sequential order. Such links are well known in the computer context, 
but are not so limited. 

For instance, see U.S. Patent Application No. 09/320,132, filed May 25, 1999, entitled
"Playing Audio Of One Kind In Response To User Action While Playing Audio of Another
Kind", filed by William J. Loewenthal et al., commonly owned with this application; and also
see commonly owned U.S. Application No. 09/272,633, filed March 18, 1999, entitled "Program
Links and Bulletins for Audio Information Device", filed by William J. Loewenthal et al., all 
incorporated herein by reference in their entireties. The above-referenced applications disclose 
use of links between various data files which are programs, or parts of programs, which 
advantageously allow the user to traverse from one data file to another data file depending in 
some embodiments upon predetermined context related relationships. This is analogous to the 
hyperlinks well known in the computer field in, for instance, websites. Here it is used in the
context of a radio-type audio receiver, and is also adaptable for a television-type receiver
("Player"). Of course, this requires that the receiver include local storage for storing the program
contents so that the user can navigate between the program segments. Also, see U.S. Patents No. 
5,406,626; 5,524,051; 5,590,195, all issued to John O. Ryan, and incorporated herein by 
reference in their entireties. 

Hence, while such linked audio programs are known and computer editing of video and
audio programs is, in general, also known, in the prior art the construction of such linked
programs requires substantial effort even using computer-based editing.

SUMMARY OF THE INVENTION 

The present system allows the producer to more easily produce the programs by allowing 
him/her to make the associations ("links") between and among program segments in a visual 
workspace. The program produced is thereby not unidirectional (sequential in time only). The 
system allows the producer to edit the programs into a program that does not play back from
beginning to end in a unidirectional (time) line, but rather, a program that has associations that 
allow playback of the produced programs in an order that is determined by the listener (i.e., the 
programs are interactive). Unidirectional editing as described above is known in the prior art 
(see e.g., U.S. Patent No. 5,892,507, Moorby et al., incorporated by reference in its entirety). 
This is a type of editing in which multimedia data can be linked together to be played back in the 
form of a story from beginning to end along a unidirectional (time) line. 

The present system is an improvement on this time-based editing process that producers 
have used in the past. The present system provides a visual workspace, on a single screen, that 
enables the producer to isolate the segments comprising any program, to convert these segments 
into objects, to establish non-linear (multi-dimensional) relationships ("links") between and 
among any number of such objects, and to prepare multimedia programs. The benefits of the 
system over the well-known, time-based editing process include the production of programs 
more quickly, the reduced need for technical expertise on the part of producers, and the 
improvement in the quality of programs. 

Hence, there is provided here a system suitable for creating and editing such complex 
linked programs using a graphical user interface which allows easy linking by ordering icons 
depicting the various segments on a screen, and providing on the screen a spatial indication of 
same. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 illustrates a sample screen of the visual workspace. 

Figure 2 illustrates a linking of objects coded with the markup language. 

Figure 3 illustrates a sample view of the composer of FIG. 1, containing header and detail 
content elements. 

Figure 4 illustrates a sample of a "Player" interface on a general purpose computer that 
can be used in conjunction with this invention to play the produced program. 

Figure 5 illustrates a flowchart of the production of a program using the present system. 

DETAILED DESCRIPTION 

According to an embodiment of the present invention, multimedia programs are created 
and edited (collectively "produced") using computer software that allows multimedia content 
(e.g., audio, video, text, graphics) to be displayed as objects on a screen associated with a 
computer. A simple-to-use visual workspace (graphical user interface) is used to produce 
programs with complex associations. The software runs on, e.g., a standard computer executing 
the Windows 2000 operating system and is coded, e.g., in the C++ computer language. 

Figure 1 shows a sample screen used interactively by the producer (editor), who is a
person. This screen 1 is displayed by the software executed on the computer. Coding this 
software in a suitable language is well within the skill of one of ordinary skill in the art in light 
of this disclosure. In one embodiment the software is an add-on module to the commercially 
available Multitrack Editor V3.98, part of the DigAS editing system available from D.A.V.I.D.
GmbH of Germany. The producer views the multimedia files 2 or content in the content 
explorer 3. The content explorer 3 is a box on the screen that displays as icons the multimedia 
files 2 (program segments) available to the producer for editing. The producer can choose 
among available multimedia data files to produce programs. The file icons can be 
conventionally dragged and dropped (e.g., using a mouse associated with the computer) into the 
prompter 4, composer 5, and/or waveform editor 6 portions of the computer screen. The content 
explorer 3 contains icons for multimedia files 2, including audio, video, text and graphics files, 
as well as completed programs. This content (the actual program segments) could come from 
outside sources such as magazine articles, quiz shows, or audio/video sports and traffic updates. 
The content could also come from files or records in a computer database. Content is of the type 
available from LAN, local, and remote sources in, e.g., conventional .wav, .txt, .mp3, .jpg, and
.rtf data file formats.

The composer 5, as illustrated in Figure 1, provides the producer with a visual workspace
to produce programs. The composer provides a visual representation of the structural parts of the 
program. The composer has three viewing boxes. In these different boxes, the producer can 
drag and drop multimedia files for viewing or editing. First, there is a multimedia viewer 7, 
which displays the types of files that are contained in each segment 10. Second, there is a 
navigational viewer 8, where the navigational (or multidimensional) structure of the program, or 
the sequence of segments, is created. Third, there is a view box 9, in which the producer can 
view and edit the text or image files without having to open the individual files. 

In the composer 5, the producer can view the segments 10 available for program 
composition. The composer allows the producer to view and edit segments in the navigational 
viewer 8. The story or content in each segment icon is labeled so that the producer can edit the 
program without having to click each segment icon to determine its contents. There are also 
conventional search mechanisms to find a segment by its label. This is especially necessary for 
larger programs. Each segment 10 is conventionally represented by an icon. An unlimited
number of content elements 11 (rectangles) are available in the navigational viewer 8. When the
number of content element 11 sets reaches the edge of the composer view, the producer is able to
further view the content element 11 sets beyond the edge of the view by conventional scrolling.
The properties of the segments 10 in the content elements 11, such as the name or length of the 
segments, can be viewed in a multimedia viewer 7 by suitably selecting a segment. The 
producer can insert, delete, and/or append segments in the composer 5. The producer can review 
and listen to (or view) the content layers (discussed below) within segments in the composer 5 
using the view box 9 and the waveform editor 6. The producer can examine segments in any 
order, moving back and forth or up and down between content elements, as illustrated in Fig. 3, 
and as explained in detail below. The producer can also listen to or view header content 
elements sequentially in the composer. 

The segments 10 are placed in the content elements 11. These content elements are
shaped like rectangles in the composer's navigational viewer 8. The content elements 11 are
empty until the producer places segments 10 within the content elements 11. These content
elements are arranged in levels. Figure 3 shows the arrangement of the content elements into 
header content elements 12 (top level) and detail content elements 13 (lower level). 

The content elements are either prime, meaning they contain only one segment, or are 
compound, meaning they contain more than one segment. A content element may contain any 
number of segments. Figure 1 illustrates examples of a compound content element 14 and a 
prime content element 15. When the content element contains so many segment icons that they 
are no longer viewable on the screen, the producer can scroll within the content element to see
the multiple icon segments in the content element on the screen at the same time. The producer 
creates the associations between and among these segments, which here include associations 
other than merely sequential in time. 
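
The prime/compound distinction described above can be sketched as a small data model. The following is only an illustrative sketch in Python; the class names, field names, and level names are assumptions made for this example and are not taken from the Appendix.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """A labeled program segment (hypothetical model)."""
    label: str


@dataclass
class ContentElement:
    """A rectangle in the navigational viewer that holds segments."""
    level: str  # "header" or "detail" (assumed level names)
    segments: List[Segment] = field(default_factory=list)

    def kind(self) -> str:
        # Empty until the producer drops a segment in; one segment
        # makes it prime, more than one makes it compound.
        if not self.segments:
            return "empty"
        return "prime" if len(self.segments) == 1 else "compound"


element = ContentElement(level="header")
element.segments.append(Segment("Tournament headline"))
first_kind = element.kind()    # evaluated with one segment present
element.segments.append(Segment("Tournament photo"))
second_kind = element.kind()   # evaluated with two segments present
```

Deriving the kind from the segment count, rather than storing it, keeps the element consistent as the producer inserts or deletes segments.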

As seen in Figure 3, one embodiment of the composer provides a "template" that consists
of content elements for header information of a story and content elements for the details of a 
story. There are three types of programs that can be produced. The most basic type is a one- 
level program. The end user (who ultimately plays the produced program) can determine the 
order this program is played, scanning forward or backwards through segments. A complex type 
of program is a two-level program. A two-level program has (linked) headers and details which 
are both content elements. A header provides, e.g., a short introduction to a story. A detail 
provides, e.g., the body of a story. The content elements (icons) are arranged in a template in the 
composer to make it easier for the producer to make the associations in the program. A template 
can be spatially arranged to have header content elements 12 on the top level and to have detail 
content elements 13 on the lower level of the composer's navigational viewer 8. 

Another complex type of program is a three-level program. (The number of levels is not 
limited.) A template could also be arranged to have a third level of detail content elements. The 
producer uses this third level of detail content elements to arrange segments in the program 
which further relate to the header or details of the story. For example, the third level contains an 
advertisement for the program, along with an option for the end user to make a purchase. In this 
example, the producer could also place a segment with a video clip in the third-level detail 
content element that would play once the end user made the purchase. This third level of content 
elements is associated with either or both of the other two levels of content elements. 

The producer can via the user interface make any number of associations between and 
among the header content elements and detail content elements. For example, a header content 
element could be linked with one or more detail content elements. A producer could link these 
by introducing on the screen an explicit relationship graphic line between the header and each of 
the detail content elements. Alternatively, a producer could make a duplicate header content 
element (using conventional shading or coloring on the icon to indicate it is a duplicate) and 
copy it on the screen to a location above each of the detail content elements that the producer 
intends to be associated with the header content element. One detail content element could also 
be shared by (linked to) multiple header content elements using similar linking techniques. 
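
One way to picture these many-to-many associations is as a two-way lookup table. The sketch below is a hypothetical Python illustration; the `LinkTable` class and the element identifiers are assumptions for this example, and the disclosed system creates the links graphically rather than through such an interface.

```python
from collections import defaultdict
from typing import DefaultDict, List


class LinkTable:
    """Many-to-many associations between header and detail elements."""

    def __init__(self) -> None:
        self._details_of: DefaultDict[str, List[str]] = defaultdict(list)
        self._headers_of: DefaultDict[str, List[str]] = defaultdict(list)

    def link(self, header: str, detail: str) -> None:
        # Record each association once, in both directions.
        if detail not in self._details_of[header]:
            self._details_of[header].append(detail)
            self._headers_of[detail].append(header)

    def details(self, header: str) -> List[str]:
        return list(self._details_of.get(header, []))

    def headers(self, detail: str) -> List[str]:
        return list(self._headers_of.get(detail, []))


links = LinkTable()
links.link("H1", "D1")
links.link("H1", "D2")
links.link("H2", "D2")  # detail D2 is shared by two headers
```

Here `links.details("H1")` lists the details reachable from one header, while `links.headers("D2")` lists every header that shares a given detail.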

The prompter 4, as shown in Figure 1, allows the producer to create text content and open
an existing text file to put into a program. The prompter 4 is for viewing and editing text. Text
can be imported from a text file or word document or composed directly in the prompter 4. The
prompter 4 can receive files from the content explorer 3 and from outside applications. The
prompter 4 allows the producer to mark parts of the text with tags 16 (Figure 3). The producer
can tag parts of text for re-evaluation later in the editing process. The tags 16 will remain in the
prompter 4 as long as any of these remain in the text. When the text is recorded into audio (using,
for example, conventional text to speech methods), the recorded audio would be automatically
separated into individual segments based on the tags 16 placed in the text.
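
The tag-based separation can be illustrated with a short Python sketch. The text does not define a tag syntax, so the literal marker `[TAG]` and the function name below are assumptions chosen purely for illustration; the sketch splits the text, whereas the described system separates the recorded audio at the corresponding points.

```python
from typing import List

TAG = "[TAG]"  # hypothetical marker; the source does not define a tag syntax


def split_on_tags(text: str) -> List[str]:
    """Split prompter text into per-segment chunks at each tag."""
    parts = text.split(TAG)
    # Drop surrounding whitespace and any empty chunks between adjacent tags.
    return [part.strip() for part in parts if part.strip()]


script = "Tiger wins again. [TAG] He used new clubs. [TAG] Next up: traffic."
chunks = split_on_tags(script)
```

Each resulting chunk would correspond to one segment of the recorded audio.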

As illustrated in Figure 1, the visual workspace also contains a conventional audio 
waveform editor 6. The waveform editor 6 allows the producer to edit audio files and segment 
audio programs. 

Segments of multimedia data are conventionally created by the producer. The producer 
places a content layer or multiple content layers of multimedia data in the segments. The 
producer can drag and drop multimedia data files 2 (content) from the content explorer 3 into the 
individual segments that make up a program. Figure 2 illustrates in a chart the content layers 18 
that make up a segment (Figure 2 is for conceptual purposes and is not part of the user interface). 
Each content layer contains multimedia data. Content layers 18 may contain either audio, video, 
text or graphics data. 

The editing system also enables the producer to apply differing conventional compression 
techniques to each content layer within each segment. Different content may need to be 
compressed at different rates in order to minimize bandwidth consumption in the system. For
example, if an audio layer contains music and speech, an author using the editing system can 
apply a different compression to the music and the speech. 
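
The effect of choosing a compression rate per kind of content can be shown with simple arithmetic. The Python sketch below is illustrative only: the bit rates, the `AudioPart` class, and the durations are assumed values, since the text names no specific codecs or rates.

```python
from dataclasses import dataclass

# Assumed, illustrative bit rates in kilobits per second.
BITRATE_KBPS = {"speech": 16, "music": 64}


@dataclass
class AudioPart:
    kind: str          # "speech" or "music"
    duration_s: float  # playing time in seconds


def compressed_kilobytes(part: AudioPart) -> float:
    """Estimate the compressed size of one part at its own bit rate."""
    return BITRATE_KBPS[part.kind] * part.duration_s / 8


parts = [AudioPart("speech", 60), AudioPart("music", 30)]
total_kb = sum(compressed_kilobytes(part) for part in parts)
```

Under these assumed rates, the 60 seconds of speech costs less than the 30 seconds of music, which is the motivation for not compressing everything at the higher rate.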

After the segments are edited by the producer in the visual workspace by manipulating 
their icons, in one embodiment the segments are coded in a markup language. This is done to 
provide an exportable file using the markup language (ML). MLs are routinely used in this 
context for exportable computer files. Figure 2 illustrates the associations that the markup 
language allows between the segments. The segments are coded by the ML so that they can be 
represented as objects 17. The objects 17 are segments, coded with the ML. After coding with 
the ML, the segment is an object 17, composed of content layers 18. These objects 17 contain 
the same audio, text, graphics or video content layers 18 that were in the original segments. This 
markup language establishes relationships between and among objects (using well-known 
hyperlinking technology). The ML also enables the graphical user interface ("GUI") on the
associated "Player" to interact with the content layers of the objects. The Player is a receiver 
unit, as described in the above-referenced patent applications, to which an embodiment of this 
invention delivers the programs edited by the editing system. The graphical user interface
depicts a ML multimedia object composed of ML descriptions 19. Thus, the editing system 
allows the producer to assemble a markup language multimedia object from multiple source 
components. 
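
As a rough illustration of coding a segment as a marked-up object, the Python sketch below builds a small XML element from a segment's content layers and links. The element and attribute names (`object`, `layer`, `link`, `target`) and the file names are placeholders invented for this example; they are not the CAML vocabulary defined in the Appendix.

```python
import xml.etree.ElementTree as ET


def segment_to_object(seg_id, layers, links):
    """Encode one segment as a marked-up object.

    Element and attribute names are placeholders for illustration,
    not the actual markup language described in the Appendix.
    """
    obj = ET.Element("object", id=seg_id)
    for media_type, src in layers:
        # One child element per content layer (audio, text, and so on).
        ET.SubElement(obj, "layer", type=media_type, src=src)
    for target in links:
        # One child element per association with another object.
        ET.SubElement(obj, "link", target=target)
    return obj


header = segment_to_object(
    "header-1",
    layers=[("audio", "headline.mp3"), ("text", "headline.txt")],
    links=["detail-1"],
)
markup = ET.tostring(header, encoding="unicode")
```

The resulting string is an exportable representation of the same content layers that were in the original segment, together with its links.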

Once the program is completed, the program can be played in the visual workspace on a 
general purpose computer using a Player application program, or can be played using the 
physical dedicated Player device. The computer requires the Player application to play the 
program. An example of a Player application program interface 20 on a general purpose 
computer is illustrated in Figure 4. The Player interface 20 has a small screen and various 
controls as illustrated. The editor or end user can listen to the header of a story and can listen to 
the details of that story by pressing the "more info" button 20a on the Player interface 20. For 
example, if the end user is listening to a story about Tiger Woods, and mention of his equipment 
is made in that story, the end user can press the "more info" button 20a on the Player interface 
and hear about the equipment Tiger Woods uses and could also hear an advertisement for this 
equipment and even make a purchase of the advertised equipment. The user could also 
simultaneously watch footage of Tiger swinging a golf club in the viewing screen 20b of the 
Player. If the editor or end user does not want to listen to the details of a story, he can continue 
listening to the default program of continuous header story segments that the producer has 
produced using this invention. Other default programs may be produced; the producer could, for 
example, produce a program that would play the header and each detail of a story even if not 
prompted by the user. The dedicated Player is of the type described above and has similar 
functionality as interface 20. 
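
The default playback behavior described above can be sketched as follows. This Python example is a simplified assumption of how a Player might order segments; the function name and the segment identifiers are hypothetical, and a real Player reacts to button presses during playback rather than receiving them in advance.

```python
from typing import Dict, List, Set


def playback_order(
    headers: List[str],
    details: Dict[str, List[str]],
    more_info_pressed: Set[str],
) -> List[str]:
    """Return the order in which segments are heard.

    Headers play in sequence by default; the details linked to a header
    are played only if the listener pressed "more info" during it.
    """
    order: List[str] = []
    for header in headers:
        order.append(header)
        if header in more_info_pressed:
            order.extend(details.get(header, []))
    return order


program = ["golf-headline", "traffic-headline"]
linked = {"golf-headline": ["golf-equipment", "golf-club-ad"]}
default_run = playback_order(program, linked, set())
curious_run = playback_order(program, linked, {"golf-headline"})
```

With no button presses the listener hears only the continuous header stories; pressing "more info" during the golf headline inserts its linked details before the next header.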

The present system allows the producer to more easily produce linked type programs by 
allowing him/her to make the associations between and among program segments in a visual 
workspace. The process of creating associations is made easier because the producer can not 
only arrange the stories in content elements, but can also see the segments and what they contain. 
Thus, for example, in a header content element, the producer can see there is a heading segment 
of a story about Tiger Woods winning a tournament, and in the detail content element there is a
segment about the details of the story about Tiger winning a tournament. Further, the producer 
can see that a segment (or segments) in a third-level detail content element contains an 
advertisement for golf clubs in which the end user could, for example, make a purchase or watch 
a video of Tiger using the golf clubs being advertised. 

Figure 5 illustrates a flowchart of the production of a program using the present system. 
First, program content or multimedia data files are stored in a local database 22 associated with 
the editing computer. The producer uses these files to produce programs in the visual workspace
21. When production is complete, the programs are saved in the database 22 and notification is 
sent to the ML generator 23 (which is the ML described above). The ML generator 23 retrieves 
the programs from database 22 and codes the program with the markup language. The resulting 
marked-up program is saved in a new database 24 also associated with the editing computer. 
The distribution system 25 is then notified that the program is complete. Distribution system 25 
is, e.g., a wireless broadcast system, such as the system utilized by the Player, as described in the 
above-referenced patent applications. Distribution system 25 can also be, e.g., an internet 
system, such as the system utilized by an internet based version of the Player. In addition, the 
distribution system 25 can be, e.g., a physical data storage media such as Compact Flash, Tape, 
or Compact Disk. Finally, the Player 26 receives content from the distribution system 25 and 
stores the content in its local database 27. The stored content is then available for interactive 
playback through the Player's 26 graphical user interface. 
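
The flow of Figure 5 can be summarized in a few lines of Python. This is a minimal sketch under simplifying assumptions: the databases are modeled as in-memory dictionaries, notifications as direct function calls, and the markup as a placeholder string rather than real CAML. All function names are hypothetical.

```python
from typing import Dict, List

local_db: Dict[str, List[str]] = {}   # database 22: produced programs
marked_up_db: Dict[str, str] = {}     # database 24: marked-up programs
player_db: Dict[str, str] = {}        # database 27: the Player's storage


def distribute(name: str) -> None:
    """Stand-in for distribution system 25 delivering to Player 26."""
    player_db[name] = marked_up_db[name]


def ml_generator(name: str) -> None:
    """Stand-in for ML generator 23: fetch from 22, code, save to 24."""
    segments = local_db[name]
    # Placeholder markup, not real CAML.
    marked_up_db[name] = "".join(f"<seg>{s}</seg>" for s in segments)
    distribute(name)  # notify the distribution system


def save_program(name: str, segments: List[str]) -> None:
    """Producer saves a finished program; the generator is notified."""
    local_db[name] = segments
    ml_generator(name)


save_program("morning-news", ["headline", "details"])
```

After the call, the marked-up program sits in the Player's local store, ready for interactive playback.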

The computer program listing Appendix, which is a part of this application, includes a 
description including computer code of the markup language described above, and is entitled 
"CAML: Command Audio Markup Language Specification." The Appendix also includes 
computer code, and descriptions thereof in the form of comments, to carry out the system in 
accordance with this disclosure. That description is entitled "On-Demand Authoring (ODA) 
Studio Design Specification." 

Having described the embodiments of the invention, it should be apparent to those skilled 
in the art that the foregoing is merely illustrative and not limiting, having been presented as an 
example only. Modifications and other embodiments are within the scope of one of ordinary 
skill in the art and are contemplated as being within the scope of the invention as defined by the 
following claims. 

1. CAML Language
1.1. Abstract

The Command Audio Markup Language (CAML) is designed to meet the Metadata definition needs of audio-centric interactive
multimedia distributed over wireless digital audio broadcast (DAB) systems to mobile and stationary consumer devices. By
Metadata, we generally mean data about content, such as the broadcast time of an audio program or that this edition of Nightline
has five stories. Using CAML, an author of a CAML document can combine and interrelate multiple audio-centric media objects
into compelling presentations. In automobile driver and other 'eyes on the road, hands on the wheel' environments, only the
audio layers of the rich media are expected to be available to consumers. Thus, the audio-centricity of the specification (as
opposed to video-centric definitions such as MPEG4 or MPEG7) is a core fundamental requirement. The media objects may include
interactive capabilities, but these should not mandate a driver's attention (e.g. to play a media selection or to complete an
advertised purchase). In order to experience these multimedia objects independent of the broadcast schedule (i.e. as on-demand
or time-shifted media), there is no requirement for real-time presentation.

WO 03/021416    PCT/US02/27820

DAB systems are distinguished in that they are one-way, point-to-multipoint transmission systems. These systems have an
inherent, fundamental advantage over point-to-point transmission systems (e.g. wireless internet systems) by allowing a single
broadcast to reach an unlimited number of receivers at no incremental bandwidth cost. DAB systems also exhibit unique data loss
characteristics (e.g., data loss when obstacles block reception).
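
The bandwidth argument can be made concrete with a small calculation. The Python sketch below uses an assumed stream rate of 64 kbps and an assumed audience of 1000 receivers, both chosen only for illustration, to compare the aggregate transmit bandwidth of point-to-point delivery with that of a single broadcast.

```python
def transmit_bandwidth_kbps(stream_kbps: float, receivers: int, broadcast: bool) -> float:
    """Aggregate bandwidth the sender needs to reach every receiver."""
    if broadcast:
        # One transmission serves all receivers.
        return stream_kbps
    # Point-to-point: one copy of the stream per receiver.
    return stream_kbps * receivers


STREAM_KBPS = 64  # assumed rate, not taken from the source
unicast = transmit_bandwidth_kbps(STREAM_KBPS, 1000, broadcast=False)
dab = transmit_bandwidth_kbps(STREAM_KBPS, 1000, broadcast=True)
```

The point-to-point figure grows linearly with the audience, while the broadcast figure stays at the rate of a single stream.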

The CAML language is designed to meet the following criteria:

1. The language definition is open, flexible, extensible and able to represent complex object relationships.

2. A rich set of presentation capabilities is available to media authors.

3. The authored media is suitable for distribution over wireless DAB systems to mobile devices.

4. The presentation of CAML media is efficient and interoperable on low-power, portable consumer devices.

5. To obtain optimal advantage of DAB systems, CAML object distribution needs to be resilient to (some degree of) data loss.

6. It is suitable to define both program contents and electronic program guides (the penultimate metadata for broadcast audio or
video services).

To achieve these criteria, the authors observed a similarity between the criteria for CAML and those for internet streaming media
technologies such as RealServer and RealPlayer by RealNetworks, Inc. and Windows Media Server and Player by Microsoft, Inc. For
streaming audio, these Internet technologies are audio-centric. They may incorporate other media as non-essential presentation
elements. A noted difference is that internet streaming media allows a server to choose alternative compression streams based
on the dynamic performance of the Internet connection to each client. With DAB systems, there is no variance of bandwidth
(performance) on a per client basis. An additional distinction is the desire to enable consumer navigation capability over the
audio-centric object. That is, consumers should be able to follow audio-centric document relationships using simple navigational
interfaces which do not require visual clues such as underlined or color-based textual linkages requiring a mouse or
keyboard-centric interface.

As a starting point, the authors chose the Synchronized Multimedia Integration Language (SMIL) 2.0 specification. Thus, CAML
contains a subset of the SMIL 2.0 features including basic layout, linking, media object, structure, and timing modules, and a
set of CAML-unique extensions that enable rich navigation and semantic linking between media segments.

1.2. Introduction

This section is informative. 

The design of wireless digital audio broadcast systems has traditionally focused on delivering high-fidelity real-time audio to
consumers in mobile and stationary environments. Examples of such systems are:

• the satellite digital audio radio system (SDARS) developed by XM Satellite Radio

• the SDARS developed by Sirius Satellite Radio

• the terrestrial in-band, on-channel system developed by iBiquity Digital Corporation

• the European Telecommunications Standards Institute (ETSI) standard for Radio Broadcasting Systems; Digital Audio Broadcasting
(DAB) to mobile, portable and fixed receivers, EN 300 401 V1.3.2 (commonly known as Eureka-147)

The Command Audio Markup Language was defined to provide a uniform standard-based definition which extends these systems to allow
services that deliver additional, richer, but still audio-centric, media for time-shifted presentation. Such services are termed
On-Demand Interactive Audio (ODIA) services. The CAML language serves as a standard representation for multimedia On-Demand
Interactive presentations over DAB systems. Note that systems which implement ODIA services can include capabilities to
represent both real-time and non-real-time content and to allow time-shifted presentation of both.

CAML content providers may distribute their content over a variety of channels: the digital audio broadcast channels mentioned
above, conventional AM or FM analog channels, or two-way wireless or wired channels. EPG providers may distribute their content
over the same or different channels as the content providers.



CAML content providers may present their programs on a wide variety of clients, such as ODIA-enabled digital audio broadcast
radio receivers, DAB receivers integrated with mobile phones, car navigation systems, and voice user agents. Each of these
platforms (or integration of platforms) has its specific capabilities and may require its own flavor of CAML presentation. CAML
does not define the actual device presentation; it defines the media object and relationships.

To achieve its interoperability and portability goals, CAML includes a number of capabilities from the Synchronized Multimedia
Integration Language (SMIL) 2.0 defined at http://www.w3.org/TR/2001/WD-smil20-20010301/. The SMIL modularisation provides a way
to create profiles (subsets) of the full SMIL language, in addition to providing the means to integrate SMIL functionality into
other languages. In particular, the set of SMIL modules used in CAML represents the SMIL Basic Profile as defined at
http://www.w3.org/TR/2001/WD-smil20-20010301/smil-basic.html. But, SMIL 2.0 defines a number of capabilities (e.g. complex timing
with video objects) which are not appropriate given the CAML objectives. Future editions of this document will identify specific
basic profile elements that are not required by CAML. In addition to the SMIL modules, CAML also includes extensions that enable rich
navigation and semantic linking between audio-centric media segments.

This document presents a fixed CAML definition to maximise portability. We can anticipate that in the future, like SMIL profiles, 
a CAML profile for ODIA-enabled devices can be tailored based on the full CAML language with or without extensions to support 
application specific features. 

CAML documents represent either the metadata associated with media to be listened to by consumers (e.g. editions of audio
programs) or the metadata describing the selections of media available to consumers. The metadata describing selections is
generally referred to as an Electronic Program Guide (or EPG). Since the media experience of consumers of CAML-described content
is audio-centric, it is a complementary requirement that the EPG also be audio-centric. That is, the program guide used for
personalisation of an ODIA-enabled device should have a presentation which is consistent with all other CAML objects. In this
respect, there are similarities between an EPG for audio-centric services and an EPG for video-centric services. The TV-Anytime
Forum is creating international consensus on definitions for systems which enable video-on-demand capabilities. These include
metadata definitions for a Program Guide. Where practical, we have chosen to adopt several of these features as part of the CAML
definition.

It should be noted that the CAML definitions, SMIL definitions and TV-Anytime definitions, per se, make no statement regarding
IPR regarding implementation of systems incorporating these features. It is anticipated that this document (or future versions
thereof) will be contributed to standardization bodies such as WorldDAB (http://www.worlddab.org/), the Wireless Multimedia Forum
(http://www.wmmforum.com/), and TV-Anytime Forum (http://www.tv-anytime.org/). Such submission makes no representation as to
specific Command Audio or other organization's product capabilities, copyrights or intellectual property rights.

For information and specifications regarding TV-Anytime the reader is referred to http://www.tv-anytime.org/.

For information and specifications regarding SMIL the reader is referred to http://www.w3.org/TR/REC-smil/2000/SMiL20/. 

That a player is CAML conformant means that it can play documents at least as complex as those allowed by the CAML language
definition.

This document is expected to expand in future versions to incorporate meta information describing consumer metadata such as usage
history, user preferences, and transactional media layers for interacting with 2-way backchannels.



2. Design Rationale 
This section is informative. 

The CAML language may be supported by a wide variety of ODIA-enabled players running on multiple hardware platforms. Mobile and 
portable devices share some common characteristics: 

LCD display: The display can render only text in a small area (for example, a radio with two lines of twelve characters each). 

Simple input method: Input devices may be function keys, arrow keys, and a select key. Another may have a voice interpreter. 

Audio-centric: The primary interaction with players is audio only. That is, no textual or graphical viewing is required to 
interact with the object rendering. 

The CAML language aims to meet these requirements. 



2.1. Navigation 

A CAML document consists of one or more content items, represented in CAML language constructions. CAML constructions are used 
primarily to implement navigational capabilities of ODIA content. At the level of primary segments, SMIL modules are used. Each 
SMIL module is an XML subtree which, taken separately, fully conforms to the definition of a SMIL 2.0 Basic Profile document. 

2.2. Layout 

Layout coordinates presentation of objects on the display device. Layout directives in CAML will be defined using SMIL modules 
(see Appendix A). The SMIL BasicLayout Module is used, and the more complex functionality of the other layout modules, such as 
hierarchical layout regions, is not supported. 

2.3. User Interface 

In an ODIA player device, a user would likely use radio-like dials to select the target that activates playback or linking. A 
"mouse-like" pointing cursor device is generally not supported. While a user is handling the focus, the player may slow or pause 
its timeline, skip to the next segment, traverse to additional information, and perform other navigational movements. 

2.4. Timing and Synchronization 

Timing and Synchronization aspects of CAML presentation are handled using SMIL modules (Appendix A). The SMIL Timing and 
Synchronization Modules present dynamic and interactive multimedia according to a timeline. The SMIL timing model is expressed in 
a structured language. The timeline of a SMIL Basic Profile presentation may need to be processed with limited memory and 
processing resources of mobile devices. For example, recursive function calls caused by nesting elements and memory allocation 
for additional timelines should be restricted. To achieve this, the CAML language has restrictions on use of the SMIL Timing. 



The restrictions are: 

Begin or end conditions are not allowed. 
Non-root time containers are not supported. 



Time attributes support the basic timing for an element. Timing attribute values support a synchronization-based timeline. Simple 
event timing is useful and is usually easy to support, and so is included in CAML. 

2.5. Media Object 

Location of the media that constitute the contents of the presentation in CAML is defined using the SMIL module BasicMedia. The ODIA 
multimedia presentation is composed of media such as audio, text, images, animations and video. 

2.6. Content Control 

The CAML language includes the SMIL BasicContentControl Module. For the sake of authoring consistency, a CAML document should be 
able to contain several presentations for different kinds of clients, controlled by switch elements and test attributes. 

The player should analyze switch element syntax correctly even if it cannot recognize the test attributes' values. 
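
The behaviour described above can be sketched in Python. This is only an illustration under stated assumptions: the child list, the player dictionary and the function names are hypothetical, only the systemBitrate test attribute is modelled, and treating an unrecognized test attribute or value as a failed test is an assumption rather than something this document specifies.

```python
def passes_test(name, value, player):
    """Evaluate one test attribute against the player's capabilities."""
    if name == "systemBitrate":
        try:
            wanted = int(value)
        except ValueError:
            # Assumption: an unrecognized value fails the test instead of aborting parsing.
            return False
        # The test passes when the bandwidth available to the player is at least `wanted`.
        return player["bitrate"] >= wanted
    # Assumption: an unrecognized test attribute also fails the test.
    return False


def select_child(children, player):
    """Return the src of the first switch child whose test attributes all pass."""
    for child in children:
        if all(passes_test(n, v, player) for n, v in child["tests"].items()):
            return child["src"]
    return None


# Hypothetical children of a switch element, in document order.
CHILDREN = [
    {"src": "hifi.wav", "tests": {"systemBitrate": "64000"}},
    {"src": "lofi.wav", "tests": {"systemBitrate": "8000"}},
    {"src": "prompt.txt", "tests": {}},
]
```

A child without test attributes acts as the fallback, so authors would normally place it last.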

2.7. Accessibility 

The document "Accessibility Features of SMIL" [WAI-SMIL-ACCESS] summarizes their recommendations on SMIL 1.0 [SMIL1]. 

3. Definition of the CAML Language 
This section is normative. 

3.1. Conformance Criteria 

A CAML document is a "Conforming CAML Document" if it adheres to the specification described in this document, including the CAML DTD, 
and also: 

• The root element of the document is a caml element. 

• The document conforms to the "Extensible Markup Language (XML) 1.0" [XML10] and "Namespaces in XML" specifications, and is 
well-formed. 

• The document declares a default namespace for its elements with an xmlns attribute on the caml root element. A CAML document 
is identified with the http://www.tobedetermined URI. 

For example: 

<caml xmlns="http://www.tobedetermined"> 
</caml> 
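
The root-element criteria above can be checked mechanically. The following Python sketch uses the standard xml.etree.ElementTree module; the function name and the wording of the problem messages are hypothetical, and the namespace URI is the placeholder given in the text. ElementTree does not distinguish a default namespace declaration from a prefixed one, so the sketch only checks that the root element is in the CAML namespace.

```python
import xml.etree.ElementTree as ET

CAML_NS = "http://www.tobedetermined"  # placeholder URI quoted in the text


def check_root(document):
    """Return a list of conformance problems found at the root element."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    # ElementTree reports a namespaced tag as "{uri}local-name".
    if root.tag.startswith("{"):
        uri, local = root.tag[1:].split("}", 1)
    else:
        uri, local = None, root.tag
    problems = []
    if local != "caml":
        problems.append("root element is not caml")
    if uri != CAML_NS:
        problems.append("CAML namespace not declared on the root element")
    return problems
```

An empty list means the three root-level criteria are met; DTD validity would need a validating parser and is outside this sketch.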

If you wish to use SMIL modules that are not specifically included in the CAML language definition, you must identify them as 
being from the SMIL 2.0 namespace. Certain limits on CAML extensions should be applied (to be defined). 

A document may also identify itself as a valid CAML XML document with a DOCTYPE declaration, although a CAML document must still 
include the above CAML namespace identifier. 

The CAML language DOCTYPE is: 

<!DOCTYPE caml 
     PUBLIC "-//W3C//DTD SMIL 2.0 Basic//EN" 
     "http://www.tobedetermined/CAML.dtd"> 

This DOCTYPE declares a valid, extension-free CAML document. 



The rules above will be updated once an XML Schema for the CAML language is available. 

Conforming CAML Players 

The CAML player is a program that can parse and process a CAML document and render the contents of the document onto an output 
medium. 



The following criteria apply to "Conforming CAML Players": 

The player must conform to functionality criteria defined in each module in ways consistent with this profile specification. 
The player should support the semantics of all CAML language features. The player ignores unknown elements under support of 
skip-content attributes. The player ignores unknown attributes. 

For an attribute with an unknown attribute value, the player substitutes the default attribute value if any, or ignores the 
attribute if the attribute has no default value. 
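
The attribute-handling rules above can be sketched as follows. The rule table is hypothetical and exists only for illustration: it borrows the SMIL fit attribute (default "hidden") and the CAML first and last attributes; a real player would derive such a table from the DTD.

```python
# Hypothetical rule table: allowed values and default for each known attribute.
RULES = {
    "fit": {"allowed": {"fill", "hidden", "meet", "scroll", "slice"}, "default": "hidden"},
    "first": {"allowed": {"yes"}, "default": None},
    "last": {"allowed": {"yes"}, "default": None},
}


def resolve_attributes(attrs, rules):
    """Apply the player rules for unknown attributes and unknown attribute values."""
    resolved = {}
    for name, value in attrs.items():
        rule = rules.get(name)
        if rule is None:
            continue  # unknown attribute: ignored
        if value in rule["allowed"]:
            resolved[name] = value
        elif rule["default"] is not None:
            resolved[name] = rule["default"]  # unknown value: substitute the default
        # unknown value and no default: the attribute is ignored
    return resolved
```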
3.2. CAML Language 

The CAML language supports the lightweight multimedia features defined in the SMIL 2.0 specification. It includes the following 
SMIL 2.0 Modules: 

SMIL 2.0 Layout Modules -- BasicLayout 
SMIL 2.0 Linking Modules -- BasicLinking 
SMIL 2.0 Media Object Modules -- BasicMedia and MediaClipping 
SMIL 2.0 Structure Module -- Structure 
SMIL 2.0 Timing and Synchronisation Modules -- BasicInlineTiming, SyncbaseTiming, EventTiming, MinMaxTiming, and BasicTimeContainers 
SMIL 2.0 Content Control Modules -- BasicContentControl and SkipContentControl 



These collections of elements are used in the following sections defining the CAML content model: 

Collection Name    Elements in Collection 
LinkAnchor         a, area (anchor) 
MediaContent       text, img, audio, video, ref, animation, textstream 
Schedule           par, seq 

EMPTY 

Collections of attributes used in the tables below are defined as follows: 

Collection Name    Attributes in Collection 
Core               id, class, title 
Timing             begin, dur, end, repeatCount, repeatDur, min, max 
Description        abstract, author, copyright 
I18n               xml:lang 
Test               systemBitrate (system-bitrate), systemCaptions (system-captions), systemLanguage (system-language), 
                   systemOverdubOrSubtitle (system-overdub-or-caption), systemRequired (system-required), 
                   systemScreenSize (system-screen-size), systemScreenDepth (system-screen-depth), systemAudioDesc, 
                   systemOperatingSystem, systemCPU, systemComponent 



3.3. CAML Data References 

CAML data references are special constructs that enable the CAML document to refer to specific parts of a chunk of binary data by 
specifying the starting offset and size of the part being referred to. The goal of introducing this construct is to enable a CAML 
player to quickly locate binary data for media playback without having to scan large extents of binary data in search of 
specific boundary markers. It is recommended, therefore, that all binary data pertaining to a single CAML document be combined 
into one binary chunk that accompanies this document (although CAML allows references to more than one binary chunk using the 
chunks' IDs). 

The specific mechanism of transferring binary data and associating binary chunks with CAML documents is implementation dependent; 
however, any conformant CAML player must be able to parse CAML data references and be able to locate the specified fragment of 
data in the binary chunk(s) which accompany the current CAML document. 

3.3.1. CAML Data Reference Attributes 

A CAML data reference consists of a number of attributes as defined below: 

src 

The name of the binary chunk. The value of src is normally equal to "caml:this", thus referring to the data chunk attached to the 
current CAML document. Other values of this attribute with the "caml:" prefix refer to other binary chunks accessible to the 
document. Also, this attribute can contain any other valid URI, which is processed according to the SMIL specification [SMIL2]. 

An example of a media object element with a src attribute using a CAML data reference: 

<audio id="9" src="caml:this" caml:offset="1067914" caml:length="53360" caml:channels="1" caml:samplerate="8000" 
caml:bitspersample="16" caml:codectype="PCM" /> 
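
Resolving such a reference amounts to slicing the named chunk. The Python sketch below assumes the player already holds its binary chunks in a dictionary keyed by the part of src after the "caml:" prefix; that dictionary, the function name and the error handling are assumptions, since the document leaves the transfer mechanism implementation dependent.

```python
def read_fragment(chunks, src, offset, length):
    """Return the bytes addressed by a CAML data reference.

    chunks maps chunk names (for example "this") to bytes objects.
    """
    prefix = "caml:"
    if not src.startswith(prefix):
        raise ValueError("not a CAML data reference; handle it as an ordinary URI")
    chunk = chunks[src[len(prefix):]]
    if offset < 0 or length < 0 or offset + length > len(chunk):
        raise ValueError("reference points outside the binary chunk")
    # No scanning for boundary markers is needed: the fragment is a direct slice.
    return chunk[offset:offset + length]
```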

3.3.2. Layers and blocks 

Audio layers consist of blocks which differ by the audio compression method used. Audio blocks are represented by <audio> 
elements and grouped together using <seq> elements corresponding to audio layers. 

Along with the attributes described in the SMIL specification, the audio elements may have the following CAML-specific 
attributes: 

id 

A unique identifier of the block in the CAML document. 

caml:offset 

The starting offset of the block, in bytes, relative to the start of the CAML binary chunk. 

caml:length 

The length of the block in bytes. 

caml:playtime 

The length of the block in milliseconds. 

caml:channels 

The number of audio channels in the block, e.g. 1 for mono, 2 for stereo audio. 

caml:samplerate 

The number of samples per second in the audio block. 

caml:bitspersample 

The number of bits per sample in the audio block. 

caml:codectype 

The compression type of the audio block. Possible values are: 
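
For uncompressed PCM audio these attributes are related by simple arithmetic. The Python sketch below is an illustration only: the function name is hypothetical, and it assumes that every byte of the block is sample data (no container header), which this document does not state. It does not apply to compressed codec types.

```python
def pcm_playtime_ms(length_bytes, channels, samplerate, bits_per_sample):
    """Estimate caml:playtime in milliseconds for an uncompressed PCM block."""
    bytes_per_second = channels * samplerate * bits_per_sample // 8
    return length_bytes * 1000 // bytes_per_second
```

For a block of 53360 bytes with 1 channel, 8000 samples per second and 16 bits per sample, the data rate is 16000 bytes per second and the estimate is 3335 milliseconds.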

An example of an audio layer with block information: 

<seq title="h1_1.wav"> 
<audio id="9" src="caml:this" caml:offset="1067914" caml:length="53360" caml:channels="1" caml:samplerate="44100" 
caml:bitspersample="16" caml:codectype="DVSI" /> 

<audio id="10" src="caml:this" caml:offset="1121274" caml:length="53360" caml:channels="1" caml:samplerate="8000" 
caml:bitspersample="16" caml:codectype="Dolby" /> 
</seq> 
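
Because the blocks of a layer differ by compression method, a player can walk the layer and take the first block whose codec it supports. The Python sketch below parses such a layer with xml.etree.ElementTree. The namespace URI is the one that appears in the Section 4 example; the function name and the first-match policy are assumptions, not something this document prescribes.

```python
import xml.etree.ElementTree as ET

CAML = "http://www.commandaudio.com/caml"  # namespace URI as used in the Section 4 example

LAYER = f"""
<seq xmlns:caml="{CAML}" title="h1_1.wav">
  <audio id="9" src="caml:this" caml:offset="1067914" caml:length="53360"
         caml:channels="1" caml:samplerate="44100" caml:bitspersample="16" caml:codectype="DVSI" />
  <audio id="10" src="caml:this" caml:offset="1121274" caml:length="53360"
         caml:channels="1" caml:samplerate="8000" caml:bitspersample="16" caml:codectype="Dolby" />
</seq>
""".strip()


def pick_block(layer_xml, supported_codecs):
    """Return id, offset and length of the first block with a supported codec, else None."""
    seq = ET.fromstring(layer_xml)
    for audio in seq.findall("audio"):
        # Namespaced attributes are keyed as "{uri}name" by ElementTree.
        if audio.get(f"{{{CAML}}}codectype") in supported_codecs:
            return {
                "id": audio.get("id"),
                "offset": int(audio.get(f"{{{CAML}}}offset")),
                "length": int(audio.get(f"{{{CAML}}}length")),
            }
    return None
```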

3.4. Navigation Module 

The navigation module is the top layer of a CAML document, providing means for navigating across media segments (SMIL modules). 

Elements       Attributes                                                              Content Model 

caml           Core, I18n, caml-length, dialect                                        (content-item*) 

content-item   Core, I18n, type, contentid, expiration-time, edition-time,             (segment*) 
               announce-time 

segment        Core, I18n, next_story, prev_story, next_primesegment,                  smil 
               prev_primesegment, down_level, up_level, first, last, level, 
               assoc_content_id, cif_type, EPG_title, status_mask, 
               availability_prompt, prompt_alias 

3.4.1. The caml element 

The caml element is the root element of a CAML document. It contains content-item elements and has the following attributes: 

caml-length 

Length of the CAML document in bytes. 

dialect 

The identifier of a CAML dialect used in this content item, e.g. "Eureka147" (reserved for future use). 

3.4.2. The content-item element 

The content-item element represents any version of a program (as per TV-Anytime) that has been created by applying minor 
editorial changes to the content (e.g. editing out explicit material). It contains at least one segment element. 

type 

Content type, e.g. "program", "EPG" or "system_prompts". 

contentid 

Unique ID of the content item identifying the program in the content database. 

expiration-time 

Time when the content is considered expired, in the format "MM/DD/YYYY HH:MM:SS". 

edition-time 

Time when the content item was last edited, in the format "MM/DD/YYYY HH:MM:SS". 

announce-time 

Time when the content item is to be available for transmission, in the format "MM/DD/YYYY HH:MM:SS". 
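
The time format used by these attributes maps directly onto a strptime pattern. The Python sketch below shows one way a player might parse it and test for expiry; the function names are hypothetical, and treating the values as naive local times is an assumption, since the document does not name a time zone.

```python
from datetime import datetime

CAML_TIME_FORMAT = "%m/%d/%Y %H:%M:%S"  # "MM/DD/YYYY HH:MM:SS"


def parse_caml_time(value):
    """Parse an expiration-time, edition-time or announce-time value."""
    return datetime.strptime(value, CAML_TIME_FORMAT)


def is_expired(expiration_time, now):
    """Return True when the content item is considered expired at `now`."""
    return now >= parse_caml_time(expiration_time)
```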
3.4.3. The segment element 

The attributes of the segment element are defined as follows: 

down_level 

The value of the id attribute of a segment element one level down from the current segment in a multi-level program. 

up_level 

The value of the id attribute of a segment element one level up from the current segment in a multi-level program. This attribute 
might be redundant (tbd). 

prev_story 

The value of the id attribute of the first segment element in the previous story of the program. 

next_story 

The value of the id attribute of the first segment element in the next story of the program. 

prev_primesegment 

The value of the id attribute of the previous segment element in the current story, or, if it is the first segment of the story, 
it is equivalent to prev_story. 

next_primesegment 

The value of the id attribute of the next segment element in the current story, or, if it is the last segment of the story, it is 
equivalent to next_story. 

first 

The value of this attribute is equal to "yes" if it is the first segment of a story. 

last 

The value of this attribute is equal to "yes" if it is the last segment of a story. 

level 

The value of this attribute is equal to 0 for segment elements belonging to the first level of a multi-level program, and it is 
equal to 1 for segment elements belonging to the second level. 
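
These attributes form a small link graph that a player can follow without inspecting the media. The Python sketch below uses an abridged, hypothetical table loosely based on the first segments of the Section 4 example; the function names, and the choice to stay on the current segment when a link is absent, are assumptions.

```python
# Abridged, hypothetical segment table keyed by segment id.
SEGMENTS = {
    "3": {"next_story": "17", "next_primesegment": "17", "down_level": "7",
          "level": "0", "first": "yes", "last": "yes"},
    "7": {"next_story": "17", "next_primesegment": "11", "level": "1", "first": "yes"},
    "11": {"next_story": "17", "next_primesegment": "17", "prev_primesegment": "7",
           "level": "1", "last": "yes"},
    "17": {"prev_story": "3", "prev_primesegment": "3", "level": "0", "first": "yes"},
}


def navigate(segments, current, link):
    """Follow one navigation attribute; stay on the current segment if it is absent."""
    target = segments[current].get(link)
    return target if target in segments else current


def story_from(segments, first_id):
    """Collect segment ids of a story by following next_primesegment until last="yes"."""
    ids = [first_id]
    while segments[ids[-1]].get("last") != "yes":
        nxt = segments[ids[-1]].get("next_primesegment")
        if nxt not in segments or nxt in ids:
            break  # missing or circular link: stop rather than loop forever
        ids.append(nxt)
    return ids
```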



3.4.3.1. The segment element of EPG 

The segment element of a content item of type "EPG" has additional attributes: 

assoc_content_id 

Contentid of a topic/program associated with the content of the segment. 

cif_type 

Content class type, which can have the following values: 

• program 
• EPG 
• system_prompt 
• traffic_bulletin 
• news_bulletin 
• emergency_bulletin 
• undefined 

status_mask 

A flag defining specialised rendering of the EPG item associated with the segment. Possible values are: 

• disable_save 
• invisible_on_epg 

availability_prompt 

Reference to the system availability prompt (id of the segment containing the prompt in the "system_prompts" program). 

3.4.3.2. The segment element of System Prompt program 

The segment element of a content item of type "system_prompts" has two attributes in addition to attributes common to all content 
items: 

prompt_alias 

An alias name of the system prompt associated with the segment. 

assoc_content_id 

Contentid of a prompt associated with the content of the segment. 



4. A CAML Document Example 

The structure shown on this diagram is implemented in the following CAML document: 
<?xml version="1.0" encoding="us-ascii" ?> 
<caml version="1.12" xmlns:caml="http://www.commandaudio.com/caml" dialect="navigator" caml-length="5444"> 
  <content-item type="program" title="TestCaseS" contentid="99999" expiration-time="06/30/2003 00:00:00" 
      edition-time="07/23/2001 23:43:22" announce-time="07/23/2001 23:43:22"> 

    <segment id="3" next_story="17" next_primesegment="17" down_level="7" level="0" last="yes" first="yes"> 
      <smil xmlns="http://www.w3.org/TR/REC-smil"> 
        <body> 
          <par> 
            <seq title="h1_1.txt"> 
              <text id="4" src="caml:this" caml:offset="0" caml:length="127" /> 
            </seq> 
            <seq title="h1_1.wav"> 
              <audio id="6" src="caml:this" caml:offset="127" caml:length="1891020" caml:channels="2" 
                  caml:samplerate="44100" caml:bitspersample="16" caml:codectype="PCM" /> 
            </seq> 
          </par> 
        </body> 
      </smil> 
    </segment> 

    <segment id="7" next_story="17" next_primesegment="11" level="1" first="yes"> 
      <smil xmlns="http://www.w3.org/TR/REC-smil"> 
        <body> 
          <par> 
            <seq title="d1_1.txt"> 
              <text id="8" src="caml:this" caml:offset="1891147" caml:length="179" /> 
            </seq> 
            <seq title="d1_1.wav"> 
              <audio id="10" src="caml:this" caml:offset="1891326" caml:length="2281372" caml:channels="2" 
                  caml:samplerate="44100" caml:bitspersample="16" caml:codectype="PCM" /> 
            </seq> 
          </par> 
        </body> 
      </smil> 
    </segment> 

    <segment id="11" next_story="17" next_primesegment="17" prev_primesegment="7" level="1" last="yes"> 
      <smil xmlns="http://www.w3.org/TR/REC-smil"> 
        <body> 
          <par> 
            <seq title="d1_2.txt"> 
              <text id="12" src="caml:this" caml:offset="4172698" caml:length="107" /> 
            </seq> 
            <seq title="d1_2_S.wav"> 
              <audio id="14" src="caml:this" caml:offset="4172805" caml:length="429248" caml:channels="2" 
                  caml:samplerate="44100" caml:bitspersample="16" caml:codectype="PCM" /> 
            </seq> 
          </par> 
        </body> 
      </smil> 
    </segment> 

    <segment id="17" next_story="31" next_primesegment="21" prev_primesegment="3" prev_story="3" down_level="25" level="0" 
        first="yes"> 
      <smil xmlns="http://www.w3.org/TR/REC-smil"> 
        <body> 
          <par> 
            <seq title="h2_1.txt"> 
              <text id="18" src="caml:this" caml:offset="4602053" caml:length="76" /> 
            </seq> 
            <seq title="h2_1.wav"> 
              <audio id="20" src="caml:this" caml:offset="4602129" caml:length="307388" caml:channels="2" 
                  caml:samplerate="44100" caml:bitspersample="16" caml:codectype="PCM" /> 
            </seq> 
          </par> 
        </body> 
      </smil> 
    </segment> 

    <segment id="21" next_story="31" next_primesegment="31" prev_primesegment="17" prev_story="3" down_level="25" level="0" 
        last="yes"> 
      <smil xmlns="http://www.w3.org/TR/REC-smil"> 
        <body> 
          <par> 
            <seq title="h2_2.txt"> 
              <text id="22" src="caml:this" caml:offset="4909517" caml:length="117" /> 
            </seq> 
            <seq title="h2_2.wav"> 
              <audio id="24" src="caml:this" caml:offset="4909634" caml:length="1493416" caml:channels="2" 
                  caml:samplerate="44100" caml:bitspersample="16" caml:codectype="PCM" /> 
            </seq> 
          </par> 
        </body> 
      </smil> 
    </segment> 

    <segment id="25" next_story="31" next_primesegment="31" prev_primesegment="3" prev_story="3" level="1" last="yes" 
        first="yes"> 
      <smil xmlns="http://www.w3.org/TR/REC-smil"> 
        <body> 
          <par> 
            <seq title="d2_1.txt"> 
              <text id="26" src="caml:this" caml:offset="6403050" caml:length="88" /> 
            </seq> 
            <seq title="d2_1.wav"> 
              <audio id="28" src="caml:this" caml:offset="6403138" caml:length="4386480" caml:channels="2" 
                  caml:samplerate="44100" caml:bitspersample="16" caml:codectype="PCM" /> 
            </seq> 
          </par> 
        </body> 
      </smil> 
    </segment> 

    <segment id="31" next_primesegment="35" prev_primesegment="17" prev_story="17" down_level="39" level="0" first="yes"> 
      <smil xmlns="http://www.w3.org/TR/REC-smil"> 
        <body> 
          <par> 
            <seq title="h3_1.txt"> 
              <text id="32" src="caml:this" caml:offset="10789618" caml:length="105" /> 
            </seq> 
            <seq title="h3_1.wav"> 
              <audio id="34" src="caml:this" caml:offset="10789723" caml:length="490536" caml:channels="2" 
                  caml:samplerate="44100" caml:bitspersample="16" caml:codectype="PCM" /> 
            </seq> 
          </par> 
        </body> 
      </smil> 
    </segment> 

    <segment id="35" prev_primesegment="31" prev_story="17" down_level="39" level="0" last="yes"> 
      <smil xmlns="http://www.w3.org/TR/REC-smil"> 
        <body> 
          <par> 
            <seq title="h3_2.txt"> 
              <text id="36" src="caml:this" caml:offset="11280259" caml:length="147" /> 
            </seq> 
            <seq title="h3_2.wav"> 
              <audio id="38" src="caml:this" caml:offset="11280406" caml:length="355572" caml:channels="2" 
                  caml:samplerate="44100" caml:bitspersample="16" caml:codectype="PCM" /> 
            </seq> 
          </par> 
        </body> 
      </smil> 
    </segment> 

    <segment id="39" next_primesegment="43" prev_primesegment="17" prev_story="17" level="1" first="yes"> 
      <smil xmlns="http://www.w3.org/TR/REC-smil"> 
        <body> 
          <par> 
            <seq title="d3_1.txt"> 
              <text id="40" src="caml:this" caml:offset="11635978" caml:length="53" /> 
            </seq> 
            <seq title="d3_1.wav"> 
              <audio id="42" src="caml:this" caml:offset="11636031" caml:length="2803056" caml:channels="2" 
                  caml:samplerate="44100" caml:bitspersample="16" caml:codectype="PCM" /> 
            </seq> 
          </par> 
        </body> 
      </smil> 
    </segment> 

    <segment id="43" prev_primesegment="39" prev_story="17" level="1" last="yes"> 
      <smil xmlns="http://www.w3.org/TR/REC-smil"> 
        <body> 
          <par> 
            <seq title="d3_2.txt"> 
              <text id="44" src="caml:this" caml:offset="14439087" caml:length="229" /> 
            </seq> 
            <seq title="d3_2.wav"> 
              <audio id="46" src="caml:this" caml:offset="14439316" caml:length="562692" caml:channels="2" 
                  caml:samplerate="44100" caml:bitspersample="16" caml:codectype="PCM" /> 
            </seq> 
          </par> 
        </body> 
      </smil> 
    </segment> 

  </content-item> 
</caml> 



5. CAML Language Document Type Definition (DTD) 
This section is normative. 



<!-- file: caml.dtd 

     This is Command Audio Markup Language (CAML) 

     Copyright 2001 Command Audio Corporation. All Rights Reserved. 

     Permission to use, copy, modify and distribute the CAML DTD and its accompanying documentation for any purpose and 
     without fee is hereby granted in perpetuity, provided that the above copyright notice and this paragraph appear in 
     all copies. The copyright holders make no representation about the suitability of the DTD for any purpose. It is 
     provided "as is" without expressed or implied warranty. 

     Author: CAC 
     Revision: $Id: caml.dtd,v 0.1 2000/11/10 11:16:46 Exp $ 

     Please use this formal public identifier to identify this DTD: 
     "-//CommandAudio//DTD CAML//EN" 
--> 

<!-- First we include the necessary SMIL modules. 
     CAML supports the lightweight multimedia 
     features defined in the SMIL language. This profile includes 
     the following SMIL modules: 

     SMIL 2.0 BasicLayout Module 
     SMIL 2.0 BasicLinking Module 
     SMIL 2.0 BasicMedia and MediaClipping Modules 
     SMIL 2.0 Structure Module 
     SMIL 2.0 BasicInlineTiming, SyncbaseTiming, EventTiming, 
              MinMaxTiming and BasicTimeContainers Modules 
     SMIL 2.0 BasicContentControl and SkipContentControl Modules 
--> 

<!ENTITY % NS.prefixed "IGNORE" > 
<!ENTITY % SMIL.prefix "" > 

<!-- Define the Content Model --> 
<!ENTITY % smil-model.mod 
    PUBLIC "-//W3C//ENTITIES SMIL 2.0 Document Model 1.0//EN" 
    "smil-model-1.mod" > 

<!-- Modular Framework Module --> 
<!ENTITY % smil-framework.module "INCLUDE" > 
<![%smil-framework.module;[ 
<!ENTITY % smil-framework.mod 
    PUBLIC "-//W3C//ENTITIES SMIL 2.0 Modular Framework 1.0//EN" 
    "smil-framework-1.mod" > 
%smil-framework.mod;]]> 

<!ENTITY % layout-mod 
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Layout//EN" 
    "SMIL-layout.mod" > 
%layout-mod; 

<!ENTITY % link-mod 
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Linking//EN" 
    "SMIL-link.mod" > 
%link-mod; 

<!ENTITY % BasicLinkingModule-included 

<!ENTITY % media-mod 
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Media Objects//EN" 
    "SMIL-media.mod" > 
%media-mod; 

<!ENTITY % struct-mod 
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Document Structure//EN" 
    "SMIL-struct.mod" > 
%struct-mod; 

<!ENTITY % timing-mod 
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Timing//EN" 
    "SMIL-timing.mod" > 
%timing-mod; 

<!ENTITY % control-mod 
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Content//EN" 
    "smil-control.mod" > 
%control-mod; 

<!-- Now we define the CAML elements --> 

<!ELEMENT caml (content-item+) > 
<!ATTLIST caml 
    %Core.attrib; 
    %I18n.attrib; 
    caml-length          CDATA  #REQUIRED 
    dialect              CDATA  #REQUIRED 
> 

<!ELEMENT content-item (segment*) > 
<!ATTLIST content-item 
    %Core.attrib; 
    %I18n.attrib; 
    type                 CDATA  #REQUIRED 
    contentid            CDATA  #REQUIRED 
    prime-segment-count  CDATA  #REQUIRED 
    expiration-time      CDATA  #REQUIRED 
    edition-time         CDATA  #REQUIRED 
    announce-time        CDATA  #REQUIRED 
> 

<!ELEMENT segment (smil) > 
<!ATTLIST segment 
    %Core.attrib; 
    %I18n.attrib; 
    up_level             IDREF  #IMPLIED 
    down_level           IDREF  #IMPLIED 
    prev_story           IDREF  #IMPLIED 
    next_story           IDREF  #IMPLIED 
    next_primesegment    IDREF  #IMPLIED 
    prev_primesegment    IDREF  #IMPLIED 
    first                (yes)  #IMPLIED 
    last                 (yes)  #IMPLIED 
    level                CDATA  #IMPLIED 
    assoc_content_id     IDREF  #IMPLIED 
    cif_type             CDATA  #IMPLIED 
    EPG_title            CDATA  #IMPLIED 
    status_mask          CDATA  #IMPLIED 
    availability_prompt  CDATA  #IMPLIED 
    prompt_alias         CDATA  #IMPLIED 
> 

<!-- end of CAML DTD --> 



An XML Schema for CAML will be made available. 

Appendix A 

14. SMIL 2.0 Basic Language Profile 

Copyright 2000 World Wide Web Consortium (Massachusetts Institute of Technology, Institut National de Recherche en Informatique 
et en Automatique, Keio University). All Rights Reserved. http://www.w3.org/TR/smil20/smil-basic.html 

Editors 

Kenichi Kubota (kuboken@isl.mei.co.jp), Panasonic 
Aaron Cohen (aaron.m.cohen@intel.com), Intel 

14.1. Abstract 

The SMIL 2.0 Basic language profile defines a profile that is a baseline subset of the SMIL 2.0 language profile specification, 
and is designed to be a profile that meets the needs of resource-constrained devices such as mobile phones and portable disc 
players. It contains a baseline set of the SMIL 2.0 features including basic layout, linking, media object, structure, and timing 
modules. 

14.2. Introduction 

This section is informative. 

The Synchronised Multimedia Integration Language (SMIL, pronounced "smile") includes powerful functionality for multimedia 
services not only on desktops but also for information appliances. The SMIL 2.0 Basic language profile should serve such a 
ubiquitous network as a baseline for audio-visual presentations. 

SMIL content authors may desire their works to be available on a widespread variety of web clients, such as desktops, television 
sets, mobile phones, portable disc players, car navigation systems, and voice user agents. Each of these platforms has its 
specific capabilities and may require its own profile. The SMIL modularization provides a solution to create profiles and 
extensions of the full SMIL language profile, in addition to providing the means to integrate SMIL functionality into other 
languages. 

The SMIL Basic language profile consists of a small number of modules chosen from the complete set of SMIL 2.0 modules to provide 
a baseline in terms of semantics and syntax, and assures conformance to the larger SMIL language profile specification. A profile 
for mobile and portable devices can be tailored based on the SMIL Basic language profile with or without extensions to support 
application specific features. 

The SMIL Basic language profile does not propose to restrict extension, but aims at a baseline of conformance between the full 
SMIL language profile and one appropriate for mobile and portable services. Saying that a player is SMIL Basic conformant means 
that it can play documents at least as complex as those allowed by the SMIL Basic language profile; it may play significantly 
more complex documents. In particular, the browsers conforming to the SMIL 2.0 language profile will be automatically SMIL Basic 
compliant. Thus SMIL authoring for low power devices is easier since general tools can be used. 

14.3. Design Rationale 

This section is informative. 

The SMIL Basic language profile is a language profile that is SMIL Host Language conformant and may be supported by a wide variety of 
SMIL players, even those running on small mobile phones. Mobile and portable devices share some common characteristics: 

• Small display: The display can render text, images, audio and stream data in a small area. 
• Simple input method: Input devices may be numeric keys, arrow keys, and a select key. Some may have a pointing cursor. Another 
may have a voice interpreter. 
• Real-time embedded OS: Resources for calculation are limited by the priority order of each task. So, in a SMIL player, the use of 
timers should be restricted in number and frequency, and memory should be used sparingly. 
• Less network transaction: Network transactions, if necessary in a wireless environment, should be reduced as much as possible. 

The SMIL Basic language profile aims to meet these requirements. 

For reference, the document "HTML 4.0 Guidelines for Mobile Access" [MOBILE-GUIDE] issues guidelines for HTML content, and 
"XHTML Basic" [XHTML-BASIC] provides a minimum subset for portability and conformance. 

14.3.1. Layout 

Layout coordinates presentation of objects on the display device. Presentation on a small display has difficulty in rendering 
some objects. Layout of objects may not be able to be flexibly adjusted, nor have scroll bars. So, the layout should be simple 
and effective. Often the root-layout window will represent the full screen of the display. The BasicLayout Module is used and the 
more complex functionality of the other layout modules, such as hierarchical layout regions, is not supported. 

14.3.2. User Interface 

On a SMIL Basic player window, a user would likely use arrow keys to move focus on objects and anchors, and select the target 
that activates playback or linking. A "mouse-like" pointing cursor device might not be supported. A "move focus and select" is a 
simple user interface for communication with the SMIL player. The "mouse-click" user action can be mapped to an "activate" 
action. While a user is handling the focus, the player may slow or pause its timeline. 

14.3.3. Timing and Synchronisation 

The SMIL Timing and Synchronisation Modules present dynamic and interactive multimedia according to a timeline. The SMIL timing 
model is expressed in a structured language. The timeline of a SMIL Basic presentation may need to be processed with limited memory 
and processing resources of mobile devices. For example, recursive function calls caused by nesting elements and memory 
allocation for additional timelines should be restricted. 

To achieve this, the SMIL Basic language profile has restrictions on use of the SMIL Timing. The restrictions are: 

Time attribute values only allow single begin and end conditions. Lists of begin or end conditions are not allowed. 

In addition, the SMIL Basic profile may need to have no concept of a time container except a root time container. A time 
container groups media objects into a basic unit within a local time space, and this may increase the processing complexity of 
the document beyond device capabilities. If the document includes time containers, the nesting depth of time containers may need 
to be limited. Also, it may be required to limit the number of characters or elements. The SYMM Working Group requests feedback 
from users of SMIL Basic on these points. 

Time attributes support the basic timing for an element. Timing attribute values support a synchronization-based timeline.
Enriched players may support event values, such as "click", to start a presentation. This kind of simple event timing is
useful and is usually easy to support, and so is included in the SMIL Basic profile.






14.3.4. Media Object 
The BasicMedia Module specifies the location of the media that constitute the contents of the presentation. Here we assume the
presentation to be an audio/visual multimedia presentation composed of media such as text, images, audio and video.

14.3.5. Content Control 

The SMIL Basic profile includes the BasicContentControl Module. For the sake of authoring consistency, a SMIL document should be
able to contain several presentations for different kinds of clients, controlled by switch elements and test attributes. A client
application may not need to directly support content control mechanisms, since the content control processing of the document may
be done on the way to the client. When SMIL Basic is used in this way, the player may be thought of as the system composed of the
combination of the server and the client.

Test attributes of the BasicContentControl Module are included in the SMIL Basic language profile, even when an alternative to
the CC/PP mechanism is necessary. The player should analyse switch element syntax correctly even if it cannot recognise the test
attributes.

Some functionality of the Content Control Modules may be extended with CC/PP [CC/PP] mechanisms, which provide a way to notify
device capabilities and user preferences to an origin server and/or intermediaries such as a gateway or proxy. This allows the
generation or selection of device-tailored contents. Thus CC/PP can be used with transformation of available documents between
client and server.

14.3.6. Accessibility

There are useful guidelines for handling accessibility issues that can be applied to the profile design for information
appliances. The document "Accessibility Features of SMIL" [WAI-SMIL-ACCESS] summarises their recommendations on SMIL 1.0
[SMIL1].

14.3.7. Use of the SMIL Basic language profile

SMIL Basic language profile is designed to be appropriate for small information appliances like mobile phones and portable disc 
players. Its simple design is intended to serve as a baseline profile for application specific extensions. Similar to the 
XHTML+SMIL language profile, an HTML+SMIL mobile profile could be written that is an integration of SMIL Basic timing and media 
functionality into the HTML mobile profile. 


The SMIL Basic language profile does not restrict extension and inclusion of other modules, for example, the Animation, 
Metainformation, Transition, or additional Timing and Synchronisation Modules. A language may support additional features and 
modules and still be conformant with the SMIL Basic language profile. 
14.4. Definition of the SMIL 2.0 Basic Language Profile 
This section is normative.

14.4.1. Conformance Criteria 
Conforming SMIL 2.0 Basic Documents 

A SMIL 2.0 Basic document is a "Conforming SMIL 2.0 Basic Document" if it adheres to the specification described in this document,
including the SMIL 2.0 Basic DTD, and also meets the following criteria:

The root element of the document is a smil element.

The document is well-formed and conforms to the "Extensible Markup Language (XML) 1.0" [XML10] specification.

A document must declare a default namespace for its elements with its xmlns attribute at the smil root element. The SMIL 2.0
Basic language profile document is identified with the "http://www.w3.org/TR/REC-smil/2000/SMIL20/Basic" URI. For example:
<smil xmlns="http://www.w3.org/TR/REC-smil/2000/SMIL20/Basic">

</smil> 

To use modules that are not specifically included in the SMIL 2.0 Basic language profile, they must be identified as being from
the SMIL 2.0 namespace.

For example, a SMIL 2.0 Basic document extended to use the brush element from the SMIL 2.0 BrushMedia Module:

<smil xmlns="http://www.w3.org/TR/REC-smil/2000/SMIL20/Basic"
      xmlns:smil2="http://www.w3.org/TR/REC-smil/2000/SMIL20/">

  <smil2:brush color="red" begin="10s" dur="20s"/>
</smil>
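
As a rough illustration of the root-element and namespace criteria, the following sketch uses Python's standard xml.etree.ElementTree module. The function name is hypothetical, and the check covers only well-formedness, the smil root element, and the default namespace. It does not validate against the DTD.

```python
import xml.etree.ElementTree as ET

# Namespace URI given in the text for the SMIL 2.0 Basic language profile.
SMIL_BASIC_NS = "http://www.w3.org/TR/REC-smil/2000/SMIL20/Basic"


def has_basic_namespace(document: str) -> bool:
    """Return True if the document is well-formed XML whose root element
    is smil in the SMIL 2.0 Basic namespace."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        # Not well-formed, so it cannot be a conforming document.
        return False
    # ElementTree reports namespaced tags as "{namespace-uri}local-name".
    return root.tag == "{%s}smil" % SMIL_BASIC_NS
```

A document that omits the xmlns attribute on the root element fails this check even if every element name is otherwise valid.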

A document may additionally identify itself as a valid SMIL XML document with a SMIL DOCTYPE declaration, although a SMIL 2.0
Basic document must still include the above SMIL 2.0 Basic namespace identifier.

The SMIL 2.0 Basic language profile DOCTYPE is:

<!DOCTYPE SMIL
PUBLIC "-//W3C//DTD SMIL 2.0 Basic//EN"
"http://www.w3.org/TR/REC-smil/2000/SMIL20Basic.dtd">

This DOCTYPE declares a valid, extension-free SMIL 2.0 Basic document. The rules above will be updated once an XML Schema for
the SMIL 2.0 Basic language profile is available.

Conforming SMIL 2.0 Basic Players

A SMIL 2.0 Basic player is a program that can parse and process a SMIL 2.0 Basic document and render the contents of the document 
onto an output medium. 

The following criteria apply to a "Conforming SMIL 2.0 Basic Player":

• The player must parse a SMIL Basic document and evaluate its well-formedness conformant to the XML 1.0 Recommendation [XML10].

• The player must conform to functionality criteria defined in each module in ways consistent with this profile specification.

• The player should support the semantics of all SMIL 2.0 Basic language profile features.

• The player ignores unknown elements under support of skip-content attributes.

• The player ignores unknown attributes.

• For an attribute with an unknown attribute value, the player substitutes the default attribute value if any, or ignores the
attribute if the attribute has no default value.
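
The last criterion can be sketched as a small lookup. This is an illustrative sketch only: the function name and the FIT_VALUES set are assumptions introduced here, with the fit attribute and its "hidden" default taken from the layout section of this profile.

```python
from typing import Optional, Set

# Assumed set of known values for the fit attribute, used only as an example.
FIT_VALUES: Set[str] = {"fill", "hidden", "meet", "scroll", "slice"}


def resolve_attribute(value: str, known_values: Set[str],
                      default: Optional[str] = None) -> Optional[str]:
    """Apply the unknown-value rule: keep a known value, otherwise fall
    back to the default; None means the attribute is ignored."""
    if value in known_values:
        return value
    return default
```

For example, resolve_attribute("stretch", FIT_VALUES, "hidden") falls back to "hidden", while an unknown value for an attribute without a default yields None and the attribute is ignored.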

14.4.2. SMIL 2.0 Basic Language Profile

The SMIL 2.0 Basic language profile supports the lightweight multimedia features defined in the SMIL 2.0 specification. This
language profile includes the following SMIL 2.0 Modules:

• SMIL 2.0 Layout Modules: BasicLayout*

• SMIL 2.0 Linking Modules: BasicLinking*

• SMIL 2.0 Media Object Modules: BasicMedia* and MediaClipping

• SMIL 2.0 Structure Module: Structure*

• SMIL 2.0 Timing and Synchronisation Modules: BasicInlineTiming*, SyncbaseTiming*, EventTiming, MinMaxTiming*, and
BasicTimeContainers*

• SMIL 2.0 Content Control Modules: BasicContentControl* and SkipContentControl*

(*) = required modules in order to be SMIL Host Language Conformant.

These collections of elements are used in the following sections defining the SMIL 2.0 Basic profile's content model:

Collection Name    Elements in Collection
LinkAnchor         a, area (anchor)
MediaContent       text, img, audio, video, ref, animation, textstream
Schedule           par, seq

EMPTY 

Collections of attributes used in the tables below are defined as follows:

Collection Name    Attributes in Collection
Core               id, class, title
Timing             begin, dur, end, repeat (deprecated), repeatCount, repeatDur, min, max
Description        abstract, author, copyright
I18n               xml:lang
Test               systemBitrate (system-bitrate), systemCaptions (system-captions), systemLanguage (system-language),
                   systemOverdubOrSubtitle (system-overdub-or-caption), systemRequired (system-required),
                   systemScreenSize (system-screen-size), systemScreenDepth (system-screen-depth), systemAudioDesc,
                   systemOperatingSystem, systemCPU, systemComponent
14.4.3. Layout Modules

The Layout Module provides a framework for spatial layout of visual components. The SMIL 2.0 Basic profile includes the SMIL 2.0
BasicLayout Module. This profile requires layout type of "text/smil-basic-layout".

Elements       Attributes                                          Content Model
layout         Core, I18n, type                                    ( root-layout?, region* )
region         Core, I18n, backgroundColor (background-color),     EMPTY
               height, width, bottom, fit, left, right, top,
               z-index, showBackground
root-layout    Core, I18n, backgroundColor (background-color),     EMPTY
               height, width

The default of the size and position attributes is the full screen available to a client player. The default value of the fit
attribute is "hidden". So when a player cannot lay out a media object of target, it presents the object as is inside the
region.

14.4.4. Linking Modules 

The Linking Module describes the hyperlink relationships between content. The SMIL 2.0 Basic profile includes the SMIL 2.0
BasicLinking Module.

A SMIL Basic player may not be able to control links between the source and the destination object, for example, in applying
sourceLevel and destinationLevel attributes to volume. In this case it is permitted that the attributes of target control, which
are listed in the table below, are just ignored. As a consequence, the attributes of the SMIL 2.0 BasicLinking Module might be
replaced with their default values. The default value for the show attribute is "replace" and the sourcePlaystate attribute is
ignored. See their definitions for detail.

The area element and the deprecated anchor element may not need to be supported if the device does not have an appropriate user
interface, in which case the SMIL Basic player should play the presentation without the area map.

Elements         Attributes                                                    Content Model
a                Core, I18n, href, show, sourcePlaystate,                      MediaContent*
                 destinationPlaystate, sourceLevel, destinationLevel,
                 accesskey, tabindex, target, external, actuate
area (anchor)    Core, I18n, Timing, alt, coords, href, show,                  EMPTY
                 sourcePlaystate, destinationPlaystate, sourceLevel,
                 destinationLevel, accesskey, tabindex, target, external

14.4.5. Media Object Modules

The Media Object Modules provide a framework for declaring media that constitute the contents of a SMIL presentation, and specify
their locations. The SMIL 2.0 Basic profile includes the SMIL 2.0 BasicMedia and MediaClipping Modules.

Elements                          Attributes                                             Content Model
text, img, audio, video,          Core, I18n, Description, Timing, Test, src, region,    area (anchor)
animation, textstream, ref        fill, alt, longdesc, type, clipBegin (clip-begin),
                                  clipEnd (clip-end)



Elements of the BasicMedia Module have attributes describing basic timing properties of contents. For timing, the begin and end 
attributes should have one attribute value for a single timeline. 

14.4.6. Structure Module 

The Structure Module describes the structure of the SMIL document. The SMIL 2.0 Basic profile includes the SMIL 2.0 Structure 
Module. 

The structure element body is implicitly defined to be a seq time container as in the SMIL 1.0 and SMIL 2.0 language profiles, and
this is true in the SMIL 2.0 Basic profile as well.

Elements    Attributes    Content Model
smil        Core, I18n    ( head?, body? )
head        Core, I18n    ( layout? | switch? )
body        Core, I18n    ( Schedule | switch | MediaContent | LinkAnchor )*

14.4.7. Timing and Synchronization Modules 

The Timing and Synchronization Modules provide a framework for describing timing structure, timing control properties, and
temporal relationships between elements. The SMIL 2.0 Basic profile includes the basic functionality of the SMIL 2.0 Timing and
Synchronization Modules (SMIL Timing Modules). It is based upon the SMIL 2.0 BasicInlineTiming, SyncbaseTiming, EventTiming, and
BasicTimeContainers modules.

The SYMM Working Group desires feedback on the need to limit the complexity of SMIL Basic documents in order to ensure that
they can be played on low power devices. Until recently, we have been considering limiting the use of time containers to one
top-level container, but this now appears both insufficient in limiting document complexity and too restrictive to authors. We would
like feedback from implementers on what limitations on complexity are necessary.

Time containers are basic units in synchronization defined in the SMIL Timing Modules and they group elements with
synchronization relationships. In a SMIL Basic document with a single time container, par and seq time container elements contain
media object elements and should not have other time container elements nested inside them.
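
A tool could measure how deeply par and seq elements nest before handing a document to a constrained player. The sketch below is an assumption-laden illustration: the function names are hypothetical, the implicit seq behaviour of body is not counted, and a depth of 1 corresponds to the non-nested case described above.

```python
import xml.etree.ElementTree as ET

TIME_CONTAINERS = {"par", "seq"}


def local_name(tag: str) -> str:
    """Strip a "{namespace}" prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def max_container_depth(element: ET.Element, depth: int = 0) -> int:
    """Return the deepest nesting level of par/seq elements at or below element."""
    if local_name(element.tag) in TIME_CONTAINERS:
        depth += 1
    # The deepest level is either this element's own level or a child's.
    return max([depth] + [max_container_depth(child, depth) for child in element])
```

A document whose depth exceeds 1 contains a time container nested inside another one, which a SMIL Basic player with a single time container may not be able to schedule.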

Attributes of the SMIL Timing Modules apply to elements of the SMIL 2.0 BasicMedia Module. Basics of timing are described in the
SMIL Timing Modules. Control of element start/end time is available with the begin, dur, and end attributes. For repeating
elements, the deprecated repeat attribute and the repeatCount and repeatDur attributes are available.

The begin and end attributes contain the offset-value, syncbase-value, smil-1.0-syncbase-value, and event-value attribute values
as single conditions. The SMIL Basic profile includes the EventTiming Module, which includes "activateEvent" or "click". These
events are mapped to the "activate" user action to select a focused element in a SMIL Basic player.

As for the SMIL Timing Modules for SMIL 2.0 Basic, there is only one attribute value in begin and end attributes, not a
semicolon-separated list, in order to create a single flat timeline.

Elements    Attributes                                                        Content Model
par         Core, I18n, Description, Test, Timing, endsync, fill, region     ( switch | MediaContent | Schedule | LinkAnchor )*
seq         Core, I18n, Description, Test, Timing, fill, region              ( switch | MediaContent | Schedule | LinkAnchor )*

The default value of the endsync attribute is "last". The default value of the fill attribute depends on the element type as 
described in the SMIL Timing Modules. 

14.4.8. Content Control Modules 

The Content Control Module provides a framework for selecting content. The SMIL 2.0 Basic profile includes the SMIL 2.0
BasicContentControl and SkipContentControl Modules. The skip-content attribute, whose default value is "true", applies to SMIL
2.0 elements that were not a part of SMIL 1.0, to elements that it was applied to in SMIL 1.0, and to elements currently
empty in SMIL 2.0 that could be made non-empty in the future, following the definition in the module.

A SMIL Basic player must process the syntax of a switch element correctly even if the player cannot evaluate the Test
attribute in a media object element properly as the attribute implies. Test attributes that cannot be evaluated will default to
false.
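
The "default to false" rule can be sketched as follows. This is a simplified model rather than the normative switch algorithm: the function name, the tuple representation of children, and the evaluator table are hypothetical, and the sketch assumes a switch selects the first child whose test attributes all evaluate to true.

```python
from typing import Callable, Dict, List, Optional, Tuple

Child = Tuple[str, Dict[str, str]]  # (child label, its test attributes)


def select_switch_child(children: List[Child],
                        evaluators: Dict[str, Callable[[str], bool]]) -> Optional[str]:
    """Return the label of the first acceptable child of a switch, or None."""
    for label, tests in children:
        acceptable = all(
            # A test attribute the player cannot evaluate counts as false.
            evaluators[name](value) if name in evaluators else False
            for name, value in tests.items()
        )
        if acceptable:
            return label
    return None
```

A child with no test attributes is always acceptable, so placing one last gives the switch a fallback for players that cannot evaluate any of the tests.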



Elements    Attributes    Content Model
switch      Core, I18n    ( ( LinkAnchor | MediaContent )* | Schedule* | layout* )

14.4.9. XML Namespace Declarations

XML Namespace declarations using the xmlns attribute are supported on all elements in the SMIL Basic language profile.
14.5. SMIL 2.0 Basic Language Profile Document Type Definition (DTD)
This section is normative.

The SMIL 2.0 Basic language profile DTD is defined as a set of SMIL 2.0 modules integrated. 

<!-- SMIL Basic Document Model Module ===================================== -->
<!-- file: smilbasic-model-1.mod

     This is SMIL 2.0 Basic, a proper subset of SMIL 2.0.
     Copyright 2000 W3C (MIT, INRIA, Keio), All Rights Reserved.

     This DTD module is identified by the PUBLIC and SYSTEM identifiers:

     PUBLIC "-//W3C//ENTITIES SMIL 2.0 Basic Document Model 1.0//EN"
     SYSTEM "smilbasic-model-1.mod"

     Author: Kenichi Kubota, Warner ten Kate, Jacco van Ossenbruggen

     Revision: $Id: smilbasic-model-1.mod,v 1.27 2000/09/21 03:10:02 kkubota Exp $
-->

<!--
     This file defines the SMIL 2.0 Basic Document Model.

     All attributes and content models are defined in the second half of this file. We first start with some utility definitions.
     These are mainly used to simplify the use of modules in the second part of the file.
-->

<!-- ================== Util: Body - Media ================================ -->
<!ENTITY % media-object "text|img|audio|video|animation|textstream|ref">

<!-- ================== Util: Body - Content Control ====================== -->
<!ENTITY % content-control "switch">

<!-- ================== Util: Body - Linking ============================== -->
<!ENTITY % BasicLink "a|anchor|area">
<!ENTITY % link "%BasicLink;">

<!-- ====================================================================== -->
<!-- ====================================================================== -->
<!-- ====================================================================== -->

<!--
     The actual content model and attribute definitions for each module section follow below.
-->

<!ENTITY % BasicInlineTiming.module "INCLUDE">
<!ENTITY % MinMaxTiming.module "INCLUDE">
<!ENTITY % SyncbaseTiming.module "INCLUDE">
<!ENTITY % EventTiming.module "INCLUDE">
<!ENTITY % BasicTimeContainers.module "INCLUDE">

<![%BasicInlineTiming.module;[
  <!ENTITY % basicTimeContainers "par|seq">
  <!ENTITY % timeContainer "%basicTimeContainers;">
  <!ENTITY % basicTimeContainers.content
    "(%media-object;|%content-control;|%link;|%basicTimeContainers;)*">
  <!ENTITY % timeContainer.content "%basicTimeContainers.content;">
  <!ENTITY % basicInlineTiming.attrib "%BasicInlineTiming.attrib;">
  <!ENTITY % basicTimeContainers.attrib "%BasicTimeContainers.attrib;">
]]>
<!ENTITY % timeContainer "">
<!ENTITY % timeContainer.content "">
<!ENTITY % basicInlineTiming.attrib "">
<!ENTITY % basicTimeContainers.attrib "">

<![%MinMaxTiming.module;[
  <!ENTITY % minMaxTiming.attrib "%MinMaxTiming.attrib;">
]]>
<!ENTITY % minMaxTiming.attrib "">

<!ENTITY % smil-time.attrib "
  %basicInlineTiming.attrib;
  %minMaxTiming.attrib;
  %Fill.attrib;
">

<!ENTITY % timeContainer.attrib "
  %basicInlineTiming.attrib;
  %basicTimeContainers.attrib;
  %minMaxTiming.attrib;
">

<!ENTITY % par.content "%timeContainer.content;">
<!ENTITY % seq.content "%timeContainer.content;">

<!ENTITY % par.attrib "
  %timeContainer.attrib;
  %System.attrib;
">

<!ENTITY % seq.attrib "
  %timeContainer.attrib;
  %System.attrib;
">

<!-- ================== Content Control =================================== -->

<!ENTITY % BasicContentControl.module "INCLUDE">
<!ENTITY % SkipContentControl.module "INCLUDE">

<!ENTITY % switch.content
  "((%media-object;|%link;)*|(%timeContainer;)*|layout)">

<!-- ================== Layout ============================================ -->

<!ENTITY % BasicLayout.module "INCLUDE">

<!ENTITY % layout.content "(root-layout?,(region)*)">
<!ENTITY % region.content "EMPTY">
<!ENTITY % rootlayout.content "EMPTY">
<!ENTITY % region.attrib "%skipContent.attrib;">
<!ENTITY % rootlayout.attrib "%skipContent.attrib;">



<!ENTITY % LinkingAttributes.module "INCLUDE">
<!ENTITY % BasicLinking.module "INCLUDE">

<!ENTITY % a.content "(%media-object;)*">
<!ENTITY % area.content "EMPTY">
<!ENTITY % anchor.content "EMPTY">

<!ENTITY % a.attrib "%smil-time.attrib;">
<!ENTITY % area.attrib "%smil-time.attrib; %skipContent.attrib;">
<!ENTITY % anchor.attrib "%smil-time.attrib; %skipContent.attrib;">



<!-- ================== Media ============================================= -->

<!ENTITY % BasicMedia.module "INCLUDE">
<!ENTITY % MediaClipping.module "INCLUDE">

<!ENTITY % media-object.content "(area|
<!ENTITY % media-object.attrib "
  %smil-time.attrib;
  %System.attrib;
  %Region.attrib;
">

<!ENTITY % smil.content "(head?,body?)">
<!ENTITY % head-layout.content "layout|switch">
<!ENTITY % head.content "(%head-layout.content;)?">
<!ENTITY % body.content "(%timeContainer;|%media-object;|
  %content-control;|%link;)*">

<!-- ================== End of smilbasic-model-1.mod ====================== -->



<!-- ...................................................................... -->
<!-- SMIL 2.0 Basic DTD ................................................... -->
<!-- file: SMIL20Basic.dtd

     This is SMIL 2.0 Basic, a proper subset of SMIL 2.0.

     Copyright 2000 World Wide Web Consortium (Massachusetts Institute of Technology, Institut National de Recherche en Informatique
     et en Automatique, Keio University). All Rights Reserved.

     Permission to use, copy, modify and distribute the SMIL Basic DTD and its accompanying documentation for any purpose and without
     fee is hereby granted in perpetuity, provided that the above copyright notice and this paragraph appear in all copies. The
     copyright holders make no representation about the suitability of the DTD for any purpose.

     It is provided "as is" without expressed or implied warranty.

     Author: Jacco van Ossenbruggen, Kenichi Kubota
     Revision: $Id: SMIL20Basic.dtd,v 1.3 2000/09/21 11:16:46 jvanoss Exp $
-->

<!-- This is the driver file for the SMIL Basic DTD.

     Please use this formal public identifier to identify it:
         "-//W3C//DTD SMIL 2.0 Basic//EN"
-->

<!ENTITY % NS.prefixed "IGNORE">






<!ENTITY % SMIL.prefix "">

<!-- Define the Content Model -->
<!ENTITY % smil-model.mod
    PUBLIC "-//W3C//ENTITIES SMIL 2.0 Basic Document Model 1.0//EN"
           "smilbasic-model-1.mod">

<!-- Modular Framework Module ............................................. -->
<!ENTITY % smil-framework.module "INCLUDE">
<![%smil-framework.module;[
<!ENTITY % smil-framework.mod
    PUBLIC "-//W3C//ENTITIES SMIL 2.0 Modular Framework 1.0//EN"
           "smil-framework-1.mod">
%smil-framework.mod;]]>

<!-- The SMIL 2.0 Basic Profile supports the lightweight multimedia
     features defined in SMIL language. This profile includes
     the following SMIL modules:

       SMIL 2.0 BasicLayout Module
       SMIL 2.0 BasicLinking Module
       SMIL 2.0 BasicMedia and MediaClipping Modules
       SMIL 2.0 Structure Module
       SMIL 2.0 BasicInlineTiming, SyncbaseTiming, EventTiming,
                MinMaxTiming and BasicTimeContainers Modules
       SMIL 2.0 BasicContentControl and SkipContentControl Modules
-->

<!ENTITY % layout-mod
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Layout//EN"
           "SMIL-layout.mod">
%layout-mod;

<!ENTITY % link-mod
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Linking//EN"
           "smil-link.mod">
%link-mod;

<!ENTITY % BasicLinkingModule "INCLUDE">

<!ENTITY % media-mod
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Media Objects//EN"
           "SMIL-media.mod">
%media-mod;

<!ENTITY % struct-mod
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Document Structure//EN"
           "SMIL-struct.mod">
%struct-mod;

<!ENTITY % timing-mod
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Timing//EN"
           "SMIL-timing.mod">
%timing-mod;

<!ENTITY % control-mod
    PUBLIC "-//W3C//ELEMENTS SMIL 2.0 Content//EN"
           "SMIL-control.mod">
%control-mod;
1. Introduction

1.1. Purpose

The purpose of this document is to establish functional requirements and performance specifications for the ODA Studio.

1.2. Audience 

The intended audience for this document is Command Audio's management, operations, engineering, and software programming staff.
This document will also serve as a basis of agreement between these parties as to the system's functionality and performance
levels.
1.3. Scope

The scope of this document will include constraints, functional requirements, and performance specifications.
1.4. Definitions, Acronyms, and Abbreviations 

Audio-On-Demand Service    A service that enables listeners to choose the programming they want to hear, whenever they want
                           and wherever they are.
Block                      Indicates the type of compression in an audio layer
CAML                       Command Audio Markup Language (a derivative of XML)

own, object 

CAML element               The basic program building block in the ODA Studio Composer. An element can be a segment, layer, or block.
CAML header element        CAML header elements allow introductory details of a program to be placed separately from the main story
CAML detail element        CAML detail elements allow additional content relating to header elements to be placed in this area.
Compound Segment           A segment which contains more than one segment
Content                    Material for transmission to Command Audio-enabled devices.
Layer                      Represents multiple types of media, such as audio, text, image or video
Program                    Playable content made up of elements
Prime Segment              Smallest navigational unit, made up of one or more layers
CA                         Command Audio Corporation
ODA Studio                 On-Demand Authoring Studio

2. System Overview 
2.1. Explanation 

The On-Demand Authoring (ODA) Studio is editing software responsible for creating On-Demand Interactive Audio. Individuals use
ODA to produce programs in a way that enables listeners to choose their programming and the specific stories within those
programs which interest them.

The ODA Studio is created with the express intention of creating these navigable programs.

Single level and two-level programs can be created in the ODA Studio. Single level programs provide fewer navigation
opportunities than two-level programs. Two-level programs may include headers: the introductions to various stories within the
program, and details: the specific stories which relate to the headers.

3. Design Considerations 

3.1. Assumptions and Dependencies 

• The ODA Studio runs on the Windows 2000 Operating System.

• The ODA Studio should require minimal modifications for each system in which it is installed.

• End users will include Command Audio Production staff, as well as Production staff or Broadcasters at Command Audio-partnered
companies. End users will have the capability to put together programs from multimedia sources for transmission to Command
Audio-enabled devices.

• The ODA Studio is not designed as a comprehensive replacement for other editing programs on the market, but instead is
developed solely as a vehicle for creating Command Audio Content.

3.2. Related Software or Hardware 

3.3. General Constraints 

• For optimum editing, wave file size should not exceed xxMb.
• Files in Content Explorer should be protected so that only one user may change contents at any one time.

3.4. Goals and Guidelines 

• Should easily adapt to other partners' software

3.5. Development Methods
• Written in C++

• Once it was decided that the ODA Studio would operate on a Windows platform, ActiveX controls were chosen to distinguish the
functional capabilities of each part of the studio.






4. Architectural Strategies 
4.1. System Architecture 

The ODA Studio was conceived as a replacement for the current Command Audio editing system, ENCO. One of the main benefits of
the ODA Studio is the flexibility it offers. The ODA Studio includes 4 components:

• Content Explorer displays multimedia files which are used to create a program.

• Composer provides a visual stage to compose content. The Composer is made up of a Navigation View, Compression View, and
Multimedia View.

• Prompter is where text content can be created and existing text files can be opened.

• Waveform Editor is the place where audio files are edited and segmented into On-Demand Interactive Audio.
• segment audio programs specifically for Command Audio-enabled receivers.


4.2. Programs which can be created in the ODA Studio 

4.2.1. Type one:

A program which requires minimal processing and may only need to be marked for compression. It is comprised of one prime segment
only but contains at least one media layer. For example, the prime segment could contain a music track, associated textual
information about the singer and writer, and an image of the singer. There are no navigational elements in a prime segment.

4.2.2. Type two:

A single level program which can be linked in such a way that a listener can navigate between different segments. Each of the
segments is a prime segment capable of having several media layers. The navigation is sequential. For example, a listener can
listen to story 1, skip stories 2 and 3, listen to story 4, and go back to hear story 2 by scanning back through 3.
4.2.3. Type three:

A two level program which allows additional navigation options. The program consists of a sequence of headers with associated
detail segments. The user can now navigate from header to header or can navigate from a header to the associated detail. Instead
of only being able to navigate between segments, a listener can navigate story headlines (headers) before deciding if they are
interested in hearing the details behind the headlines. These programs allow a listener to navigate back and forth through
headers and details in the program segments.
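
The header and detail navigation described for a type three program can be modelled as a small state machine. The following is a minimal sketch under stated assumptions: the Story and TwoLevelNavigator names are hypothetical and do not come from the ODA Studio or from CAML, and segments are represented by plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Story:
    """One header segment plus the detail segments associated with it."""
    header: str
    details: List[str] = field(default_factory=list)


class TwoLevelNavigator:
    """Tracks a listener's position in a two-level program."""

    def __init__(self, stories: List[Story]) -> None:
        self.stories = stories
        self.story_index = 0
        self.detail_index: Optional[int] = None  # None: listener is on a header

    def current(self) -> str:
        story = self.stories[self.story_index]
        if self.detail_index is None:
            return story.header
        return story.details[self.detail_index]

    def next_header(self) -> str:
        """Skip to the next header, leaving any detail being played."""
        if self.story_index < len(self.stories) - 1:
            self.story_index += 1
        self.detail_index = None
        return self.current()

    def previous_header(self) -> str:
        """Go back one header, leaving any detail being played."""
        if self.story_index > 0:
            self.story_index -= 1
        self.detail_index = None
        return self.current()

    def enter_detail(self) -> str:
        """Move from the current header to its first detail, if it has one."""
        if self.stories[self.story_index].details:
            self.detail_index = 0
        return self.current()
```

In this model a listener can step through headers with next_header and call enter_detail only for the stories of interest, which mirrors the headline browsing described above.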
4.2.4. Type four:

The fourth type of program is a type three program offering the user even more navigational options. The user can essentially
navigate even within a header or a detail segment. The segments are marked in such a way that the listener can navigate within
the headers and details. If there are two separate thoughts in the headline, these might be marked in such a way that the
listener could skip, forward, or rewind within headers. The segments are marked in such a way that the listener can hear/choose
specific portions of the details as well.
4.3. ODA Studio Container and ActiveX Controls
The ODA Studio Container hosts 4 ActiveX controls:

• Content Explorer Control displays the content required to build the output multimedia CAML objects, and provides the ability to
drag and drop these objects into other parts of the ODA Studio. Examples of this content include audio programs, earcons, texts,
images etc.

• Prompter Control is the ActiveX component that allows text related to a specific audio recording to be created or edited.
• Waveform Editor Control is responsible for multiple functions of editing audio files (wave files).
• Composer Control provides functionality for composing a CAML multimedia object from multiple source components.
4.3.1. Waveform Editor

Presently, Waveform Editor Control is implemented within the ATL framework. It utilises OLE Drag and Drop for exchanging wave files
with other components. It uses the DirectSound API for accessing the system's audio resources. MFC is used for rendering waveforms
and GUI.

• Ability to have two playback modes. In Mode 1, the cursor moves across the waveform in sync as the waveform scrolls forward. In
Mode 2, the waveform view does not scroll forward when the cursor reaches the right edge, but playback continues.

4.3.1.1. Recording 

• Ability to indicate recording is in process by illuminating the record button and peak meter
• Ability to pause and resume recording by selecting pause or play.

• Ability to save wave file by using Windows keyboard shortcuts or file menu

4.3.1.2. Basic Editing 

• Ability to rewind, play, stop, pause, forward, zoom in and zoom out using buttons on the toolbar

• Ability to undo last action (actions to be defined later) using a button on the toolbar or a keyboard shortcut like Ctrl+Z
4.3.1.3. Scrolling and Selecting

• Ability to scroll left and scroll right using arrows on the keyboard or the mouse and scrollbar

• Ability to zoom in on the wave to at least 1/100th of a second, and zoom out to view the entire waveform using the zoom in/out
buttons on the toolbar.

• Ability to select sections of the wave using the mouse and Ctrl key.
• Ability to adjust the selection by single pixels from the left or right border using keyboard arrows and the shift key

4.3.1.4. Segmentation 

• Ability to insert/delete markers and compression blocks into imported waves by right clicking with the mouse
• Ability to keep markers consistent even if audio between them has been removed or changed.
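
One way to keep markers consistent after a deletion is to shift every marker that follows the removed range. The sketch below is an assumption, not the Waveform Editor's actual behavior: the function name is hypothetical, positions are sample offsets, and markers that fall inside the removed range are assumed to collapse onto the cut point.

```python
from typing import List


def shift_markers_after_delete(markers: List[int], start: int, end: int) -> List[int]:
    """Return marker positions after the samples in [start, end) are removed."""
    removed = end - start
    adjusted = set()
    for position in markers:
        if position <= start:
            adjusted.add(position)            # before the cut: unchanged
        elif position >= end:
            adjusted.add(position - removed)  # after the cut: shifted left
        else:
            adjusted.add(start)               # inside the cut: collapse to cut point
    return sorted(adjusted)
```

For instance, deleting samples 400 to 600 from a wave with markers at 100, 500 and 900 leaves markers at 100, 400 and 700 under these assumptions.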

4.3.1.5. Copying and Pasting 

• Ability to copy and paste wave sections from the Source (recording) pane to the Target pane and within the Target pane.

• Ability to replace selected portion of waveform with data from the internal clipboard at the insertion point by using Windows
shortcuts.

4.3.1.6. Playback

• Ability to begin playback of a current selection or of the entire waveform from either the left edge of the waveform display or
from the cursor position.



70 



4.3.2. Content Explorer 

Content Explorer is based on the MFC CTreeView class, which simplifies the use of the tree control CTreeCtrl, the class that
encapsulates tree-control functionality.

Content Explorer is an MFC ActiveX control. Its two objectives are:
• To display (in tree/branches format) the media sources available for building the output multimedia CAML objects. These sources can
be audio programs, texts, images, commercials, etc. These building blocks can be files or records in databases.
• To initiate a dragging mechanism when moving or copying tree items to other parts of the application.

4.3.3. Prompter

The Prompter ActiveX control provides text content for CAML objects. Prompter is an MFC ActiveX control based on the CRichEditView
class. CRichEditView provides the functionality of the rich edit control within the context of MFC's document/view architecture.
CRichEditView maintains the text and the formatting characteristics of text.

4.3.3.1. Basic Features

• Ability to flag parts of text for re-evaluation later in the editing process. A keyboard symbol and shortcut need to be assigned
to this feature so that PS can easily double-check and review work. The flag will remain in Prompter as long as any of these remain
in the text. PS can place notes specific to the file as well.
• Ability to differentiate comment text from script text using a separate color, font and brackets. The announcer needs comment text
to indicate pronunciations and emphasized words in the script.
• Ability to create headers and details which pertain to the waveform recording by pressing Ctrl+H and Ctrl+D.
• Ability to adjust font size and type using Windows keyboard shortcuts.
• Ability to open content from Content Explorer and outside applications.
• Ability to save content which has been created in Prompter, as well as content which has been imported and changed in Prompter,
using Windows keyboard shortcuts.
• Ability to scroll through text using the up and down arrows on the keyboard or a scrollbar.

4.3.4. Composer 

Composer control provides functionality for assembling a CAML multimedia object from multiple source components. Composer control
interacts with all other ActiveX controls. Its GUI depicts a CAML multimedia object composed of CAML elements. Currently, a CAML
element is shown in the GUI as a rectangle, and can be a prime segment or a compound segment.

4.3.4.1. Basic Features

• Segments can be inserted, deleted and/or appended using the buttons on the toolbar.
• When the number of segments reaches the edge of the Composer View, you should be able to scroll forward through the view.
• Segments can be viewed in any order, moving back and forth or up and down between headers and details by using the arrow keys on
the keyboard.



5. Subsystem Architecture

5.1. Composer Component

A new GUI design is implemented for the 1.2 release. New features include:

5.1.1. Single Level program presentation

Before creating a program, choose the type of program you wish to create. You can select a single-level or a two-level program
from an options menu, and the appropriate screen will appear. To create programs which do not require complex navigation
capabilities, choose a single-level program. For more complex programs, two-level programs should be created. These programs
will be segmented so that listeners will still be able to navigate between stories.

5.1.2. Navigation view

Files can be dragged and dropped, inserted, deleted, attached, appended and assembled in the Navigation view. Because they are
parts of the overall program composition, segments, layers and blocks will all be managed within the Navigation view.

5.1.3. Multimedia layer view 

You can select individual multimedia layers in the multimedia viewer. A preview pane is available to examine text, image, or
video layers. A wave layer can still be viewed in the Waveform Editor. You can right-click on a segment to view the properties
of that segment. Properties can include:
• File name
• File size
• File length
• Date the file was originally recorded
• Name of the person who last worked on the file, and when
• Source of the original recording (satellite from , ISDN from , etc.)
• Sample rate at which it was recorded.

5.1.4. Compression View 

The compression view displays the blocks which represent compression in a wave file. The compression view works with all parts 
of the multimedia viewer to illustrate the way the different layers fit together in a program. 

6. Skin capabilities 

6.1. User Interface images (bitmaps) requirements for ODA Studio 

The targeted resolution for the ODA Studio is 1024 x 768 pixels. The Composer, on the upper right-hand side of the studio, should
measure 600 x 330 pixels. The user will be able to choose the type of skin from a list of available skins. The properties of the
skin can be applicable to one component or to all of them. Skin elements must be able to be stretched without distorting the original
image.

6.1.1. Requirements for the Composer:
1. A background image - ComBk.bmp

6.1.1.1. For the Prime segments

• Prime segment empty selected - PriSegES.bmp
• Prime segment empty unselected - PriSegEUs.bmp
• Prime segment full selected - PriSegFUs.bmp
• Prime segment full unselected - PriSegFUs.bmp
• Detail: Prime segment empty selected - PriSegES(det).bmp
• Detail: Prime segment empty unselected - PriSegEUs(det).bmp
• Detail: Prime segment full selected - PriSegFUs(det).bmp
• Detail: Prime segment full unselected - PriSegFUs(det).bmp

6.1.1.2. For the Compound Segment:

• A bitmap image that looks good when stretched.
• Compound segment selected - ComSegS.bmp (should this be ComSegES.bmp)
• Compound segment unselected - ComSegUs.bmp
• Compound segment full selected - ComSegFS.bmp
• Compound segment full unselected - ComSegFU.bmp
• Detail: Compound segment selected - ComSegS(det).bmp (ComSegES.bmp)
• Detail: Compound segment unselected - ComSegUs(det).bmp
• Detail: Compound segment full selected - ComSegFS(det).bmp
• Detail: Compound segment full unselected - ComSegFU(det).bmp
• Prime segment empty selected - PriSegES.bmp
• Prime segment empty unselected - PriSegEUs.bmp
• Prime segment full selected - PriSegEUs.bmp
• Prime segment full unselected - PriSegFUs.bmp

6.1.1.3. ToolBar bitmap with an image for the following functions: (ComToolBar.bmp)
• Append
• Attach File
• Delete
• Insert Before
• Insert After
• Assemble
• Player

6.1.1.4. Icons for Multimedia layer view.
• Icons need to be 16 x 16 pixels in size.
• One background image - ComLayBk
• For Segment (This is in place of a sign in the tree view of the layer view. The tree view shows the layers in the segment.)
  • Opened segment - ComLaySegO.bmp
  • Closed segment - ComLaySegC.bmp
• For Wave file - ComLayWav.bmp
• For Text file - ComLayTxt.bmp
• For Image file - ComLayImg.bmp
• For Video file - ComLayVid.bmp
• For Animation - ComLayAni.bmp
• For Compression view:
  • For background - ComCompBk.bmp
• To represent different compressions like DVSI and DOLBY in the compression view we need two more bitmaps which will look good
when resized:
  • To represent DVSI - ComCompDVSI.bmp
  • To represent DOLBY - ComCompDOL.bmp

6.2. Requirements for the Content Explorer:
Background image - ConBk.bmp

• Icon for the main repository
• Icons need to be 16 x 16 pixels in size.
• The following subfolder icons (we can either have a separate icon for each folder or a uniform icon for all folders).
• If we have only one icon for all folders:
  • ConFolC.bmp
  • ConFolO.bmp
• If we have separate icons for each type of folder:
  • Audio Clips - ConFolAud.bmp
  • Earcons - ConFolEar.bmp
  • Music samples - ConFolMus.bmp
  • Prompt files - ConFolPro.bmp
  • Video files - ConFolVid.bmp
  • Text Files - ConFolTxt.bmp
  • Image Files - ConFolImg.bmp
• Wave Files - ConWav.bmp
• Text Files - ConTxt.bmp
• Music Files - ConMus.bmp
• Animation Files - ConAvi.bmp
• Image Files - ConImg.bmp

The icons created for the Multimedia layer view can also be used to denote each file in the Content Explorer.
6.2.1. Requirements for Wave Editor:

Bitmap for toolbar of target pane - WavToolbarTrg.bmp, with the following buttons:
• Rewind
• Play
• Stop
• Pause
• Fast forward
• Zoom in
• Zoom out
• Scroll left
• Scroll right

Bitmap for toolbar of source pane - WavToolbarSrc.bmp, with the following buttons:
• Rewind
• Play
• Stop
• Pause
• Record
• Fast forward
• Zoom in
• Zoom out
• Scroll left
• Scroll right

Bitmap for navigating handle - WavNavHdl.bmp

7. Detailed System Design 

7.1. Basic design properties 

• The Container communicates with a control using a set of COM interfaces. Two-way communication is performed through connection
points, advise sinks and other interfaces.
• The ODA Studio Container has ambient properties, which allow the Container to provide information to its controls. This ensures
a uniform look and feel for all the Controls visible inside the Container. The Controls are supposed to react to a change of
the ambient properties and synchronize their appearance and behavior with the other controls accordingly.
• The ODA Studio Controls have properties which affect the behavior of a control. They can also be used to pass data to and from
the control.
• The ODA Studio Controls also support stock properties, which are standard properties like Font and Color. Standard properties
are also known to the Container and therefore can be bound. Specifically, a Control can inform the Container that a stock property is
about to change and allow the Container to reject the change.
• A Control's properties are persistent, so that information is retained when the control is closed. When the control is restarted, it will
default to the settings used when last closed.
• The ODA Studio Controls are required to respond to ambient property changes.

7.2. Waveform Editor

7.2.1. Interfaces

Currently the Waveform Editor Control exposes one interface, IWaveEditObj, with five methods:

SetWavePath(BSTR bstrPath): When a wave file is selected, the variable bstrPath passes the name of the wave file to the
Waveform Editor.

Play(): Commands the Waveform Editor to play the loaded wave file.

StopPlay(): Commands the Waveform Editor to stop playing the loaded wave file.

GetSegmentCount(long* pnCount): Returns the number of segments in the loaded wave file in the variable pnCount.

GetSegmentPathAt(long index, BSTR* bstrPath): Creates a wave file for the segment indicated by index and returns its path in
bstrPath.

7.2.2. Ambient Property Handling 

On startup the control calls CComControl's methods GetAmbientBackColor, GetAmbientForeColor and GetAmbientFont to set up stock
properties. When an ambient property changes, the Container calls IOleControl::OnAmbientPropertyChange() on the Waveform Editor
Control, passing the DISPID of the property that has changed. The control overrides this method, obtains the property
value and handles the change.
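
The dispatch on the changed DISPID might look roughly like the following sketch. It is portable C++ rather than ATL code: the numeric constants mirror the standard OLE ambient DISPIDs but are redefined locally so the sketch builds without Windows headers, and the Appearance struct and the free function onAmbientPropertyChange are hypothetical stand-ins for the control's state and for its override.

```cpp
#include <string>

// Local copies of the standard OLE dispatch IDs (normally provided by the Windows SDK).
constexpr long kDispidUnknown    = -1;    // several ambient properties changed at once
constexpr long kAmbientBackColor = -701;
constexpr long kAmbientFont      = -703;
constexpr long kAmbientForeColor = -704;

// Hypothetical snapshot of the appearance-related properties.
struct Appearance {
    unsigned long backColor;
    unsigned long foreColor;
    std::string fontName;
};

// Copies the changed ambient value into the control's own state.
// Returns true when the control should repaint.
bool onAmbientPropertyChange(long dispid, const Appearance& ambient, Appearance& control) {
    switch (dispid) {
    case kAmbientBackColor: control.backColor = ambient.backColor; return true;
    case kAmbientForeColor: control.foreColor = ambient.foreColor; return true;
    case kAmbientFont:      control.fontName  = ambient.fontName;  return true;
    case kDispidUnknown:    control = ambient;                     return true;
    default:                return false;  // a property this control does not track
    }
}
```

In a real ATL control the ambient values would be read through the CComControl getters named above rather than passed in as a struct.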

7.2.3. Custom Properties 

For setting up custom properties a property page is required. 
Initialising properties 
Persisting properties 

7.2.4. Drawing 

Synchronous scrolling is implemented to display a playing waveform. 
Current rendering rate is 10 frames per second.

7.2.5. User Interaction

7.2.5.1. Target/ Source Panes 

Waveform Editor consists of two editing panes. In the target pane, extra material from the recorded wave can be removed. It
can also be marked for scanning and compression purposes. In the lower (source) pane, sound can be recorded through a mixer for
inclusion into the wave file in the target pane.

Either pane can be accessed by clicking the left mouse button. Wave files from other applications may be dropped into the
target pane. The target pane can also be a drag source. Drag and drop is implemented using the standard OLE drag-and-drop mechanism.

The source pane has the same functionality as the target pane, and also allows for sound recording through a mixer (there is a
"record" button in the play control). Selections can be dragged outside of the control. Selections can also be dragged to the target
pane for inclusion into an existing wave file loaded there (not implemented yet).

The target and the source panes are separated by a splitter, which allows the pane windows to be resized.

7.2.5.2. Editing Tools 

Waveform Editor has five control buttons: "go to start", "play", "pause" (not implemented), "stop", "go to end". NB: the "record"
button is absent in the target pane.

The target pane also has four control buttons: "zoom in", "zoom out", "scroll back", and "scroll forward". Scrolling is also
possible with a horizontal scroll bar at the top of the pane.

The pane has horizontal rulers to measure the playback time up to a 100th of a second. The rulers change scale with window
resizing and scroll in sync with playback.
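
As an illustration of ruler labels with 100th-of-a-second resolution, the following sketch converts a sample index into a label. The function name formatRulerTime, the m:ss.hh label layout and the truncating (rather than rounding) conversion are assumptions for illustration; the text above does not specify the label format.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a sample position as minutes:seconds.hundredths.
// Assumes sampleIndex >= 0 and sampleRate > 0; fractions below 1/100 s are truncated.
std::string formatRulerTime(std::int64_t sampleIndex, int sampleRate) {
    const long long centis = static_cast<long long>(sampleIndex) * 100 / sampleRate;
    const long long minutes = centis / 6000;
    const long long seconds = (centis / 100) % 60;
    const long long hundredths = centis % 100;
    char buf[32];
    std::snprintf(buf, sizeof buf, "%lld:%02lld.%02lld", minutes, seconds, hundredths);
    return buf;
}
```

At a 44,100 Hz sample rate, sample 441 is the first position labelled 0:00.01, and a position 61.5 seconds into the file is labelled 1:01.50.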

7.2.5.3. Setting Markers

A navigation handle indicates the current position in the wave file. The handle can be moved back or forward with the mouse. Sound
will be played when the handle is moved using the mouse (scrubbing).

Markers can be set to define compression blocks. Markers have the following attribute: compression (Dolby, DVSI, Skip). The
marker attributes are indicated in a dialog window. Segment markers have labels to indicate the compression choice. Markers can
subsequently be removed by double-clicking on the segment marker and pressing the "Remove marker" button.

Portions of the wave file can be selected by holding the Ctrl key down while dragging with the left mouse button. Selections are
draggable outside of the control.

The context menu appears when the right mouse button is clicked. Menu options include setting a segment marker at the current mouse
position, and selection cut, copy and paste (not implemented).

7.3. Content Explorer Control

7.3.1. Drag and Drop Mechanism

Content Explorer implements the MFC OLE drag-and-drop mechanism. When a user transfers data, using either the Clipboard or drag and
drop, the data has a source and a destination. Content Explorer provides the data for copying, and another part of the application (it
can be Composer or Waveform Editor) accepts it for pasting. MFC provides two classes that represent each side of this
transfer:

Data source (as implemented by a COleDataSource object) represents the source side of the data transfer. It is created by the
source Content Explorer when data is to be copied to the Clipboard, or when data is provided for a drag-and-drop operation.

Data object (as implemented by a COleDataObject object) represents the destination side of the data transfer. It is created when
the destination ActiveX control has data dropped into it, or when it is asked to perform a paste operation from the Clipboard.

Currently the well-known CF_HDROP format is used for dragging and dropping files.

7.3.2. User Interaction

Content Explorer displays all folders and files that reside in the directory called Repository. A user can drag a tree item and
drop it onto a layer, segment, tree view or waveform editor. Text files can be dragged into Prompter.

7.4. Prompter Control 

7.4.1. Interfaces 

The Prompter ActiveX control has just one interface, _DPrompter.

To get a pointer to this interface:

// Ask the control for its _DPrompter interface.
CComPtr<_DPrompter> pObj;
pPrompterCtrl->QueryInterface(__uuidof(_DPrompter), (LPVOID*)&pObj);
// FooMethod stands for any method of the interface.
pObj->FooMethod();

Here pPrompterCtrl, of type LPOLEOBJECT, is a pointer to the Prompter ActiveX control.

The _DPrompter interface has the following methods:

void OnFileOpenText();
void OnFileSaveText();
void OnFileSaveTextAs();
BSTR GetDocumentPath();
BSTR GetContent();
BOOL IsModified();

7.4.2. User Interaction 

A user can do the following things:

• Open a text file by selecting the "Open Prompter Text" menu item from the File menu.
• Or, place the cursor in Prompter and begin composing text.
• Edit the text.
• Divide the text into headers and details. To insert a header marker, a user must press the Ctrl and H keys at the same time. To
insert a detail marker, a user must press the Ctrl and D keys at the same time. Prompter automatically numbers headers and details
sequentially. For example, if the last header is <Header> 5, pressing the Ctrl and H keys puts <Header> 6 at the cursor position.
• Save the text using one of the following: Save As, Save Prompter Text, or Save Prompter Text As, from the File menu.
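
The sequential numbering of header and detail markers can be sketched as follows. This is a minimal illustration in portable C++, assuming that the next number is one more than the number on the last marker of the same kind found in the text. The function name nextMarker is hypothetical, and the real Prompter works on a rich edit control rather than a plain string.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns the marker to insert next, e.g. "<Header> 6" when the last header in
// `text` is "<Header> 5". `kind` is "Header" or "Detail".
std::string nextMarker(const std::string& text, const std::string& kind) {
    const std::string tag = "<" + kind + ">";
    int last = 0;  // no marker of this kind yet: numbering starts at 1
    for (std::size_t pos = text.find(tag); pos != std::string::npos;
         pos = text.find(tag, pos + tag.size())) {
        std::size_t i = pos + tag.size();
        while (i < text.size() && text[i] == ' ') ++i;  // skip spaces after the tag
        int n = 0;
        bool hasDigits = false;
        while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i]))) {
            n = n * 10 + (text[i] - '0');
            hasDigits = true;
            ++i;
        }
        if (hasDigits) last = n;  // remember the number on the most recent marker
    }
    return tag + " " + std::to_string(last + 1);
}
```

Headers and details are counted independently in this sketch, so a text that contains only headers yields <Detail> 1 as the next detail marker.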

7.5. Composer 

7.5.1. Interfaces 

Currently the Composer ActiveX control has just one interface, _DComposer. The following code shows how to get this interface:

// Ask the control for its _DComposer interface.
CComPtr<_DComposer> pObj;
pComposerCtrl->QueryInterface(__uuidof(_DComposer), (LPVOID*)&pObj);
// FooMethod stands for any method of the interface.
pObj->FooMethod();

Here pComposerCtrl is a pointer of type LPOLEOBJECT to the Composer ActiveX control.
Currently this interface has just one method:

SetPrompter((LPUNKNOWN*)&(pPrompter)). The Composer ActiveX control has a member of class COleClientItem*, m_pPrompter. This
method sets m_pPrompter to the Prompter ActiveX control pointer.

7.5.2. User Interaction 

Composer has the following buttons:

Append. Appends new empty header and detail elements to the CAML object.

Attach. Opens a file dialog box. A user can choose a file and attach it to the selected element.

Delete. Deletes the selected element. In a two-level program, if a header is selected, its corresponding detail element will
automatically be deleted as well.

Insert before. Inserts a segment before the selected element. In a two-level program, an upper and a lower element will be
inserted.

Insert after. Inserts a segment after the selected element. In a two-level program, an upper and a lower element will be inserted.

Play next. Plays the next header element.

Play previous. Plays the previous header.

Play down. Plays the header's corresponding detail element when a header element is selected.

Play up. Plays the detail's corresponding header when a detail element is selected.

Stop. Stops playing the selected element.

Assemble. Attaches wave segments from the source pane of the Waveform Editor control to the CAML elements according to the
prompter text layout. The first segment is assigned to the first header/detail in the prompter text, and so on.
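
The pairing performed by Assemble can be sketched as a simple in-order assignment. The code below is an illustration only: the names Assignment and assemble are hypothetical, markers and segments are reduced to plain strings, and leaving surplus markers or segments unassigned is an assumption, since the text does not say what happens when the counts differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One (prompter marker, wave segment path) assignment.
using Assignment = std::pair<std::string, std::string>;

// Pairs the i-th marker of the prompter text with the i-th wave segment.
std::vector<Assignment> assemble(const std::vector<std::string>& markers,
                                 const std::vector<std::string>& segmentPaths) {
    std::vector<Assignment> result;
    // Stop at the shorter list; anything left over stays unassigned.
    const std::size_t count = std::min(markers.size(), segmentPaths.size());
    for (std::size_t i = 0; i < count; ++i) {
        result.emplace_back(markers[i], segmentPaths[i]);
    }
    return result;
}
```

In the control itself the segment paths would presumably come from the Waveform Editor's GetSegmentCount and GetSegmentPathAt methods described in section 7.2.1.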

In addition, a user can do the following things:

• Drag a wave file from the Content Explorer control or Microsoft Windows Explorer and drop it on a CAML element or multimedia
layer in the Composer control.
• Drag a selected wave segment from the first pane of the Waveform Editor control and drop it on a CAML element.
• Move a file attached to a CAML element to another element.
• Append a file attached to a CAML element to a file attached to another element. To do this, a user has to press the Ctrl key and
move the file at the same time.

8. Detailed Subsystem Design-D.A.V.I.D Systems 

8.1. Overview 

The On-Demand Authoring Center's Composer ActiveX (control) interacts with the Multitrack Editor (client). The Multitrack Editor
is a client of the Composer. Communication between the two is bi-directional: the client controls the Composer and vice versa.

8.2. Composer ActiveX

The Composer implements two kinds of interfaces: incoming and outgoing. The incoming interface contains methods that are called
by the client. The outgoing interface is used to notify the client of the control's events. It is called a sink interface,
because it sinks event notifications.

8.3. Incoming Interface
8.3.1. Description

The control's incoming interface is called _DComposer and is created by the MFC ActiveX Control Wizard. The current _DComposer
interface has the following methods:

- BSTR GetBlocks(BSTR bstrObjectID);
- long OnDelete(BSTR bstrObjectID);
- long OnDragDrop(BSTR bstrFormat, long X, long Y, short mode, BSTR bstrObject);
- void OnFileNew();
- BSTR OnGetMetadata();
- long OnSetMetadata(BSTR bstrComposerMetadata);
- long SetProperties(BSTR bstrSetProperties);
The GetBlocks(BSTR bstrObjectID) method returns an XML string containing information about blocks in the object with a given
object ID.

The OnDelete(BSTR bstrObjectID) method is called by the client when an object is being deleted.

Clips from the client multi-format clipboard or any track can be moved to a composer element (header or detail prime segment).
The OnDragDrop() method is repeatedly called by the client during drag-and-drop operations. The method has the following
parameters:

- bstrFormat represents a drag format, which is an XML string with the following structure:

<objectinfo>
  <objectid>objectid</objectid>
  <dur>duration in US</dur>
</objectinfo>

- X and Y are the current mouse coordinates. Using the coordinates, the Composer can determine whether the mouse is inside a
particular Composer element;

- mode indicates whether a drag or a drop operation occurs;

- bstrObject is a string representation of the object ID of the dragged object.
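
The coordinate test that OnDragDrop relies on can be sketched as a rectangle hit test. The names ElementRect and elementAt are hypothetical, and the layout of Composer elements as axis-aligned rectangles with exclusive right and bottom edges is an assumption drawn from the description of elements as rectangles in section 4.3.4.

```cpp
#include <cstddef>
#include <vector>

// Bounding box of one Composer element in client coordinates.
// The right and bottom edges are treated as exclusive.
struct ElementRect {
    long left, top, right, bottom;
};

// Returns the index of the first element containing (x, y), or -1 if none does.
int elementAt(const std::vector<ElementRect>& elements, long x, long y) {
    for (std::size_t i = 0; i < elements.size(); ++i) {
        const ElementRect& r = elements[i];
        if (x >= r.left && x < r.right && y >= r.top && y < r.bottom) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Because the client calls OnDragDrop repeatedly during a drag, a test of this kind would run on every call, for instance to highlight the element currently under the mouse.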

The OnFileNew() method is called to create a new project in the Composer.

The Composer Metadata presents the structure and content of the program loaded in the Composer. The client calls the 
OnGetMetadata () method to get the Composer Metadata. The return value is a string representation of the Composer Metadata. So, 
the serialised project includes this string. The Metadata string is an XML string. 

The OnSetMetadata(BSTR bstrComposerMetadata) method is called by the client upon loading the project. The Composer, using
bstrComposerMetadata, can restore the previously saved program. bstrComposerMetadata is an XML string.

SetProperties(BSTR bstrSetProperties) is used for previewing layers selected in the Multimedia Layer View. The parameter
bstrSetProperties is an XML string with the following structure:

<objectinfo>
  <objectid> </objectid>
  <dur> </dur>
  <markerinfo>
    <marker>
      <pos> </pos>
      <title> </title>
      <type> </type>
    </marker>
  </markerinfo>
</objectinfo>
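
A string with this structure could be produced along the following lines. This is a sketch in portable C++ using std::string instead of BSTR. The names Marker, escapeXml and buildObjectInfo are hypothetical, and the numeric types chosen for pos and dur are assumptions, since the text does not state their types or units.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory form of one <marker> entry.
struct Marker {
    long pos;
    std::string title;
    std::string type;
};

// Escapes the characters that would otherwise break element content.
std::string escapeXml(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;";  break;
        case '>': out += "&gt;";  break;
        default:  out += c;       break;
        }
    }
    return out;
}

// Builds the <objectinfo> string in the element order shown above.
std::string buildObjectInfo(const std::string& objectId, long dur,
                            const std::vector<Marker>& markers) {
    std::string xml = "<objectinfo><objectid>" + escapeXml(objectId) + "</objectid>";
    xml += "<dur>" + std::to_string(dur) + "</dur><markerinfo>";
    for (const Marker& m : markers) {
        xml += "<marker><pos>" + std::to_string(m.pos) + "</pos>";
        xml += "<title>" + escapeXml(m.title) + "</title>";
        xml += "<type>" + escapeXml(m.type) + "</type></marker>";
    }
    xml += "</markerinfo></objectinfo>";
    return xml;
}
```

The sketch emits the string without whitespace between elements; an XML parser on the receiving side would treat that the same as the indented form.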

8.4. Outgoing Interface
8.4.1. Description

The control's outgoing interface is called _DComposerEvents and is created by the MFC ActiveX Control Wizard. The current
_DComposerEvents interface has the following events:

- void EditObject(long objectID, long* result)
- void GetProperties(BSTR bstrPropDescr)
- void PlayObject(long objectID, long* result)
- void StopPlayObject(long objectID, long* result)
- void SaveCurrentState(long* result)
- void SaveObject(long objectID, BSTR path, long* result)
- void SaveSource(BSTR SaveSourcePath, BSTR* bstrSourceName)
- void GetFile(BSTR xmlObjectInfo, long* result)

The client (not the control) implements all these events. The Composer can only fire the events. For example, to fire the SaveObject
event the Composer executes the following code:

FireEvent(dispidSaveObject, EVENT_PARAM(VTS_I4 VTS_WBSTR), objectID, path). FireEvent, through the COM mechanism, passes
control to the client, where the corresponding method is called.

The bstrPropDescr parameter of the GetProperties event is an XML string with the structure:

<objectinfo>
  <objectid> </objectid>
</objectinfo>

The GetFile parameter is an XML string with the structure:

<objectinfo>
  <objectid> </objectid>
  <path> </path>
  <format> </format>
</objectinfo>

8.5. Client
8.5.1. Description

The client does the following things:

• implements the events that the Composer fires. To do this, a class of type CAdviseSink that derives from _DComposerEvents has been
created
• launches the Composer ActiveX, which allows the client to create the embedded OLE item of type CDavidClientCntrItem that derives
from COleClientItem. To do this the client needs to know just the Composer ProgID "COMPOSER.ComposerCtrl.1"
• defines the Composer position in the client area
• gets a Composer pointer of type IConnectionPointContainer
• using the last interface, finds the IConnectionPoint interface
• creates an object of type CAdviseSink
• through the IConnectionPoint interface, calls the Advise method, which passes the pointer to the CAdviseSink object
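
The advise sequence above follows the usual connection-point pattern, which can be sketched without COM as a plain observer. Everything in this sketch is a simplified stand-in: ComposerEvents, ConnectionPoint and RecordingSink are hypothetical classes that only mimic the roles of _DComposerEvents, IConnectionPoint and CAdviseSink, with no reference counting or IDispatch marshalling.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for the outgoing interface, reduced to one event.
struct ComposerEvents {
    virtual ~ComposerEvents() = default;
    virtual void SaveObject(long objectId, const std::string& path, long* result) = 0;
};

// Stand-in for the control's connection point: Advise registers a sink and
// returns a cookie, Unadvise removes the sink registered under that cookie.
class ConnectionPoint {
public:
    unsigned long Advise(ComposerEvents* sink) {
        ++nextCookie_;
        sinks_[nextCookie_] = sink;
        return nextCookie_;
    }
    bool Unadvise(unsigned long cookie) { return sinks_.erase(cookie) == 1; }

    // The control side: fires the event at every connected sink.
    void FireSaveObject(long objectId, const std::string& path) {
        for (auto& entry : sinks_) {
            long result = 0;
            entry.second->SaveObject(objectId, path, &result);
        }
    }

private:
    unsigned long nextCookie_ = 0;
    std::map<unsigned long, ComposerEvents*> sinks_;
};

// The client side: a sink that records what it was asked to save.
struct RecordingSink : ComposerEvents {
    std::vector<std::string> saved;
    void SaveObject(long objectId, const std::string& path, long* result) override {
        saved.push_back(std::to_string(objectId) + ":" + path);
        *result = 1;  // report success back to the control
    }
};
```

The cookie returned by Advise is what a client keeps in order to disconnect later, which matches the way the real IConnectionPoint::Advise and Unadvise pair is used.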

9. Glossary

10. Bibliography

1. Functional requirements

We Claim:

1. A method of editing audio or video data using a display, comprising the acts of:
providing a plurality of segments of the audio or video data; 
displaying an icon representing each of the segments; 

associating a first one of the segments with at least one other of the segments; and 

displaying an indication of the association of the first to the other of the segments thereby 
composing a program of the segments; 

wherein the program may be played in a sequence determined by a user in accordance 
with the associations. 

2. The method of Claim 1, wherein the associations are each a hyperlink.

3. The method of Claim 1, wherein the associations are each displayed as a spatial arrangement
of the icons.

4. The method of Claim 1, wherein each icon when activated results in playing at least a portion
of the segment represented by the icon.

5. The method of Claim 1, wherein each icon carries an identifier for the segment represented by
the icon.






WO 03/021416 



PCT/US02/27820 



6. The method of Claim 1, wherein each displayed icon is designated as a first type or a second 
type, a plurality of the second type being associated with each of the first type. 

7. The method of Claim 1, wherein each displayed icon is designated as a first type or a second
type, a plurality of the second type associated with each of the first type or the second type.

8. The method of Claim 1, each icon representing content selected from the group consisting of
audio, video, image, or text.

9. The method of Claim 6, wherein the program when played by the user plays each of the
segments represented by an icon of the first type in a predetermined order, and allows user 
selection of playing each of the segments represented by an icon of the second type. 

10. The method of Claim 7, wherein the program when played by the user plays each of the 
segments represented by an icon of the first type in a predetermined order, and allows user 
selection of playing each of the segments represented by an icon of the second type. 

11. A computer system having a graphical user interface for editing audio or video data on a
display, comprising: 

a memory storing a plurality of segments of audio or video data, 

a portion assigning an icon to be displayed representing each of the stored segments; 

a portion which associates a first one of the segments with at least one other of the 
segments, and displays an indication of the association; 

a composer which composes a program of a set of the associated segments; and

wherein the composed programs may be played in a sequence determined by a user in
accordance with the associations.

12. The system of Claim 11, wherein the associations are each a hyperlink.

13. The system of Claim 11, wherein the associations are each displayed as a spatial
arrangement of the icons.

14. The system of Claim 11, wherein each icon when activated plays at least a portion of the 
segment represented by the icon. 

15. The system of Claim 11, wherein each icon carries an identifier for the segment represented 
by the icon. 

16. The system of Claim 11, wherein each displayed icon is designated as a first type or a 
second type, a plurality of the second type being associated with each of the first type. 

17. The system of Claim 11, wherein each displayed icon is designated as a first type or a 
second type, a plurality of the second type associated with each of the first type or the second 
type. 

18. The system of Claim 11, each icon representing content selected from the group consisting
of audio, video, image, or text.

19. The system of Claim 16, wherein the program when played by the user plays each of the
segments represented by an icon of the first type in a predetermined order, and allows user 
selection of playing each of the segments represented by an icon of the second type. 

20. The system of Claim 17, wherein the program when played by the user plays each of the 
segments represented by an icon of the first type in a predetermined order, and allows user 
selection of playing each of the segments represented by an icon of the second type. 

21. An improvement to a computer editing system, the system allowing composition of a 
program from a plurality of audio or video segments using a graphical user interface displaying 
an icon identifying each of the segments, comprising: 

assigning links between the segments by displaying a spatial arrangement of the icons; 

the spatial arrangement representing a link between a header segment and a detail 
segment wherein the program includes a plurality of header segments played in a 
predetermined order and a plurality of detail segments played in an order of election of a 
user in response to an indication of each link.

<Header> 1

A gift Crowe about 

<Detail> 1 

Russel Crowe is taking his rock band, 30 
Odd Foot of Grunts, to Texas this summer 
for a concert celebrating the 15th birthday of 
Sydney Perry, the daughter of Texas Gov. 
Rick Perry. The Aug. 18 show at Stubb's 
BBQ in Austin will benefit the city Settlement 
Home for troubled youth and will kick off a 
brief U.S. tour by the Oscar winner's band. 
Crowe became friends with the governor last 
year when his band was in Austin recording 
an album and performing. 
<Header> 2 

Chelsea Clinton, dad score U2 tickets. 

<Detail> 2 

Bill Clinton and daughter Chelsea bounced
to U2's beat at the Irish quartet's U.S. finale
Friday night in East Rutherford, N.J.
Arriving as the concert started under bright
house lights, the former president shook
hands with excited fans. U.N. Secretary
General Kofi Annan also was spotted in the
celebrity holding pen on the floor of the
Continental Arena, along with Caroline
Kennedy, model Christy Turlington and her
fiancé, actor Ed Burns, and members of
R.E.M.
<Header> 3

A sneak peek at 'Harry Potter' trailer

<Detail> 3 

The new preview for the movie Harry Potter 
and the Sorcerer's Stone will be available 
Wednesday on AOL (keyword Harry Potter) 
to subscribers. The trailer will

<Header> 1

If you've read the critical reaction to PBS's
10-part documentary Jazz (concluding this
Wednesday at 9 p.m.), you know that Ken
Burns and his crew of commentators pay
scant attention to the post-Coltrane era.
<Detail> 1

The 17-plus hour series devotes twice the
amount of time to the four years between
1935 and 1939 as it does to the past four
decades and hints that in recent years the 
public lost interest in jazz and the form has 
stagnated. Some critics have attributed this 
omission to the heavy hand of "senior 
creative consultant" Wynton Marsalis, who 
heads up what might be called the neocon 
movement of jazz, which, broadly stated, 
favors returning to codified jazz traditions 
over pushing the boundaries of the form.
<Header> 2

Here's another, more innocuous theory. In
Burns' series, jazz is inextricably tied to race
and racial struggles. Indeed, Burns has stated
outright that Jazz, along with his series on
the Civil War and baseball, is part of a trilogy
on the theme of race in America.
<Detail> 2

While it's relatively easy to build an early
history of jazz around an examination of

THE GOOD NEWS: Boron helps reduce
risk of prostate cancer. Boron is
found in nuts, wine, and fruits and
veggies.